My name is Gengchen Mai. I am a third-year MA/Ph.D. student at the Space and Time for Knowledge Organization Lab, Department of Geography, University of California, Santa Barbara. My Ph.D. adviser is Prof. Krzysztof Janowicz. My research focuses on Geographic Information Science (GIScience), Geographic Information Retrieval, Machine Learning/Deep Learning, and the Semantic Web. Right now, I am working on how to semantically enrich geospatial data and queries in a Geographic Information Retrieval framework by combining inductive and deductive methods.
Before I became an MA/Ph.D. student at UCSB, I received my B.S. degree in Geographic Information System from the Department of Geographical Information Science, School of Resource and Environmental Sciences, Wuhan University. During my undergraduate studies, my research, especially my undergraduate thesis, focused on Land Use/Cover Change (LUCC), spatial analysis, and spatial statistics.
09/24/2015 – 11/31/2017
Cartography and GIS
ADCN: An Anisotropic Density-Based Clustering Algorithm for Discovering Spatial Point Patterns with Noise
09/04/2011 – 06/30/2015
Wuhan University, Wuhan, Hubei, China 430079
Department of Geographical Information Science
Geographic Information System
Prof. Shiliang Su
Tea Plantation Expansion in Southeast of China: Process, Driving Forces and Ecological Effect
5.Krzysztof Janowicz, Pascal Hitzler, Blake Regalia, Gengchen Mai, Stephanie Delbecque, Maarten Frohlich, Patrick Martinent, Trevor Lazarus. On the Prospects of Blockchain and Distributed Ledger Technologies for Open Science and Academic Publishing [Editorial]. Semantic Web Journal, in press.
4.Gengchen Mai, Krzysztof Janowicz, Yingjie Hu, Song Gao. ADCN: An Anisotropic Density-Based Clustering Algorithm for Discovering Spatial Point Patterns with Noise. Transactions in GIS, in press. DOI:10.1111/tgis.12313
3.Shiliang Su, Yaping Wang, Fanghan Luo, Gengchen Mai, Jian Pu. Peri-urban vegetated landscape pattern changes in relation to socioeconomic development. Ecological Indicators 46 (2014) 477–486.
2.Shiliang Su, Yi’na Hu, Fanghan Luo, Gengchen Mai, Yaping Wang. Farmland fragmentation due to anthropogenic activity in rapidly developing region. Agricultural Systems 131 (2014) 87–93.
1.Rui Xiao, Shiliang Su, Gengchen Mai, Zhonghao Zhang, Chenxue Yang. Quantifying determinants of cash crop expansion and their relative effects using logistic regression modeling and variance partitioning. International Journal of Applied Earth Observation and Geoinformation 34 (2015) 258–263.
14.Gengchen Mai, Krzysztof Janowicz, Cheng He, Sumang Liu and Ni Lao. POIReviewQA: A Semantically Enriched POI Retrieval and Question Answering Dataset [Short Paper], In: Proceedings of GIR'18 Workshop co-located with ACM SIGSPATIAL 2018, Nov. 6 - 9, 2018, Seattle, Washington, USA.
13.Gengchen Mai, Krzysztof Janowicz, Bo Yan. Support and Centrality: Learning Weights for Translation-based Knowledge Graph Embedding Models, In: Proceedings of EKAW 2018, Nov. 12 - 16, 2018, Nancy, France.
12.Krzysztof Janowicz, Bo Yan, Blake Regalia, Rui Zhu, Gengchen Mai. Debiasing Knowledge Graphs: Why Female Presidents are not like Female Popes [Vision Paper], In: Proceedings of ISWC 2018, Oct. 8 - 12, 2018, Monterey, CA, USA.
11.Gengchen Mai, Krzysztof Janowicz, Bo Yan. Combining Text Embedding and Knowledge Graph Embedding Techniques for Academic Search Engines, In: Proceedings of SemDeep-4 Workshop co-located with ISWC 2018, Oct. 8 - 12, 2018, Monterey, CA, USA.
10.Gengchen Mai, Krzysztof Janowicz, Yingjie Hu, Song Gao, Rui Zhu, Bo Yan, Grant McKenzie, Anagha Uppal, and Blake Regalia. Collections of Points of Interest: How to Name Them and Why it Matters [Short Paper], In: Proceedings of Spatial big data and machine learning in GIScience Workshop at GIScience 2018, August 28 - 31, 2018, Melbourne, Australia.
9.Bo Yan, Krzysztof Janowicz, Gengchen Mai, Rui Zhu. xNet+SC: Classifying Places Based on Images by Incorporating Spatial Contexts, In: Proceedings of the 10th International Conference on Geographic Information Science (GIScience 2018), August 28 - 31, 2018, Melbourne, Australia.
8.Gengchen Mai, Krzysztof Janowicz, Sathya Prasad, Bo Yan. Visualizing The Semantic Similarity of Geographic Features [Short Paper], In: Proceedings of 21st Conference on Geo-information science (AGILE 2018), June 12 - 15, 2018, Lund, Sweden.
7.Blake Regalia, Krzysztof Janowicz, Gengchen Mai, Dalia Varanka, E Lynn Usery. GNIS-LD: Serving and Visualizing the Geographic Names Information System Gazetteer As Linked Data, In: Proceedings of ESWC 2018, June 3 - 7, 2018, Heraklion, Crete, Greece.
6.Bo Yan, Krzysztof Janowicz, Gengchen Mai, Song Gao. From ITDL to Place2Vec -- Reasoning About Place Type Similarity and Relatedness by Learning Embeddings From Augmented Spatial Contexts, In: Proceedings of the 25th International Conference on Advances in Geographic Information Systems (ACM SIGSPATIAL 2017), November 7 - 10, 2017, Redondo Beach, California, USA.
5.Blake Regalia, Krzysztof Janowicz, Gengchen Mai. Phuzzy.link: A SPARQL-Powered Client-Sided Extensible Semantic Web Browser, In: Proceedings of 3rd International Workshop on Visualization and Interaction for Ontologies and Linked Data (VOILA2017) co-located with ISWC 2017, October 22, 2017, Vienna, Austria.
4.Gengchen Mai, Krzysztof Janowicz, Yingjie Hu, Song Gao. ADCN: An Anisotropic Density-Based Clustering Algorithm [Short paper], In: Proceedings of the 24th International Conference on Advances in Geographic Information Systems (ACM SIGSPATIAL 2016), October 31 - November 3, 2016, San Francisco Bay Area, California, USA.
3.Gengchen Mai, Krzysztof Janowicz, Yingjie Hu, Grant McKenzie. A Linked Data Driven Visual Interface for the Multi-Perspective Exploration of Data Across Repositories, In: Proceedings of 2nd International Workshop on Visualization and Interaction for Ontologies and Linked Data (VOILA2016) co-located with ISWC 2016, October 17 or 18, 2016, Kobe, Japan.
2.Song Gao, Rui Zhu, Gengchen Mai. Identifying Local Spatiotemporal Autocorrelation Patterns of Taxi Pick-ups and Drop-offs [Short paper], In: Proceedings of the 9th International Conference on Geographic Information Science (GIScience 2016), September 27 - 30, 2016, Montreal, Canada.
1.Krzysztof Janowicz, Yingjie Hu, Grant McKenzie, Song Gao, Blake Regalia, Gengchen Mai, Rui Zhu, Benjamin Adams, Kerry Taylor. Moon Landing or Safari? A Study of Systematic Errors and their Causes in Geographic Linked Data, In: Proceedings of the 9th International Conference on Geographic Information Science (GIScience 2016), September 27 - 30, 2016, Montreal, Canada.
8. Oral presentation (2018): xNet+SC: Classifying Places Based on Images by Incorporating Spatial Contexts, In GIScience 2018, Aug 27 - 31, 2018, Melbourne, Australia.
7. Oral presentation (2018): Collections of Points of Interest: How to Name Them and Why it Matters, In Spatial big data and machine learning in GIScience Workshop at GIScience 2018, Aug 27 - 31, 2018, Melbourne, Australia.
6. Oral presentation (2018): Visualizing The Semantic Similarity of Geographic Features, In Annual Meeting of AAG 2018: Artificial Intelligence and Deep Learning Symposium: Geospatial Semantics and Geo-Text Data Analytics II, April 10 - April 14, 2018, New Orleans, Louisiana, USA.
5. Oral presentation (2017): A Semantically Enabled Geographic Information Retrieval Framework by using Representation Learning: A Simple Case Study of DBpedia, In GIS Day@UCSB Geography, November 17, 2017, Santa Barbara, California, USA.
4. Oral presentation (2017): ADCN: An Anisotropic Density-Based Clustering Algorithm for Discovering Spatial Point Patterns with Noise, In Annual Meeting of AAG 2017: Spatiotemporal Symposium -- Big Spatiotemporal Data Discovery and Mining Session, April 5 - April 9, 2017, San Francisco, California, USA.
3. Poster presentation (2016): ADCN: An Anisotropic Density-Based Clustering Algorithm, In 24th International Conference on Advances in Geographic Information Systems (ACM SIGSPATIAL 2016), October 31 - November 3, 2016, San Francisco Bay Area, California, USA.
2. Poster presentation (2016): A Linked Data Driven Visual Interface for the Multi-Perspective Exploration of Data Across Repositories, In 12th Reasoning Web Summer School (RW 2016) co-located with 10th International Conference on Web Reasoning and Rule Systems (RR 2016), September 5 - 9, 2016, Aberdeen, Scotland, UK.
1. Oral presentation (2016): Tea Plantation Expansion in Hangzhou, China: Process, Related factors & Ecological Effect, In 2016 Annual Meeting of AAG: The Quest to Map Plant Species Session, March 28 - April 1, 2016, San Francisco, California, USA.
2018/06 - 2018/09
Many services that perform information retrieval for Points of Interest (POI) utilize a Lucene-based setup with spatial filtering. While this type of system is easy to implement, it does not make use of semantics but relies on direct word matches between a query and reviews, leading to a loss in both precision and recall. To study the challenging task of semantically enriching POIs from unstructured data in order to support open-domain search and question answering (QA), we introduce a new dataset, POIReviewQA. It consists of 20k questions (e.g. “is this restaurant dog friendly?”) for 1022 Yelp business types. For each question we sampled 10 reviews and annotated, for each sentence in the reviews, whether it answers the question and what the corresponding answer is. To test a system’s ability to understand the text, we adopt an information retrieval evaluation by ranking all the review sentences for a question based on the likelihood that they answer this question. We build a Lucene-based baseline model, which achieves 77.0% AUC and 48.8% MAP. A sentence embedding-based model achieves 79.2% AUC and 41.8% MAP, indicating that the dataset presents a challenging problem for future research by the GIR community. The resulting technology can help exploit the thematic content of web documents and social media for the characterisation of locations.
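The MAP metric mentioned above can be illustrated with a minimal sketch. It assumes each question comes with its review sentences already ranked by a model and labeled 1 (answers the question) or 0 (does not); the function names are hypothetical and this is not the evaluation code used for the dataset.

```python
from typing import Sequence


def average_precision(ranked_labels: Sequence[int]) -> float:
    """Average precision for one question.

    ranked_labels holds 1/0 relevance labels in the order the model
    ranked the sentences (best first). Returns 0.0 if nothing is relevant.
    """
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label:
            hits += 1
            # Precision at this rank, counted only at relevant positions.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings: Sequence[Sequence[int]]) -> float:
    """Mean of the per-question average precision values."""
    return sum(average_precision(r) for r in rankings) / len(rankings)
```

A ranking that places answering sentences near the top scores close to 1.0, while one that buries them scores lower, which is why MAP complements AUC here.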
Question Answering, Information Retrieval, Deep Reinforcement Learning
2017/03 - Present
In this work, an academic search engine has been developed on top of the IOS LD Connect Knowledge Graph. Document Embedding and Knowledge Graph Embedding have been utilized to facilitate the search for papers, authors, and reviewers. Note that this search engine has been adopted as the official academic search engine for IOS Press.
2016/10 - Present
In this work, we conceptualized and prototypically implemented a Linked Data connector framework as a set of toolboxes for Esri’s ArcGIS. We discussed how, from within a GIS, to connect to Linked Data endpoints, use ontologies to probe data and derive appropriate GIS representations on the fly, make use of reasoning, derive data that is ready for spatial analysis out of RDF triples, and, most importantly, utilize the link structure of Linked Data to enable analysis.
2016/08 - Present
Funded by IOS Press
Alexandria Digital Library (ADL) is UC Santa Barbara Library's home for collections of digital research materials. This project aims at leveraging Semantic Web technologies, especially GeoSPARQL, to facilitate spatial/non-spatial queries of the ADL Gazetteer and dynamically visualize the results. Right now, the ADL Gazetteer data is held and managed in a modified version of the Apache Marmotta triple store in which GeoSPARQL is enabled. Thanks to all the help from STKO lab members, two different interfaces have been established to let users interactively explore the geographic data in the ADL Gazetteer. They are listed below:
ADL Map Interface:
This is a map interface that helps users run spatial/non-spatial queries on the ADL Gazetteer, such as finding all the entities of the "administrative region" class in the current map extent whose label contains "paris". GeoSPARQL is used for the spatial queries. Thanks to Grant McKenzie and Bo Yan.
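A query of the kind described above could be assembled roughly as follows. This is only a sketch: the class IRI, the use of `geo:hasGeometry`/`geo:asWKT` and `rdfs:label`, and both function names are assumptions for illustration, not the actual ADL Gazetteer schema or interface code. `geof:sfWithin` and `geo:wktLiteral` are standard GeoSPARQL terms.

```python
def bbox_to_wkt(min_lon: float, min_lat: float, max_lon: float, max_lat: float) -> str:
    """Turn a map extent into a closed WKT polygon ring."""
    return (
        f"POLYGON(({min_lon} {min_lat}, {max_lon} {min_lat}, "
        f"{max_lon} {max_lat}, {min_lon} {max_lat}, {min_lon} {min_lat}))"
    )


def build_query(class_iri: str, keyword: str, bbox: tuple) -> str:
    """Build a GeoSPARQL query: entities of a class, inside bbox, label contains keyword.

    The triple patterns below are an assumed data model (hypothetical).
    """
    # Escape backslashes and quotes so the keyword stays inside the string literal.
    safe = keyword.lower().replace("\\", "\\\\").replace('"', '\\"')
    wkt = bbox_to_wkt(*bbox)
    return f"""PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?place ?label ?wkt WHERE {{
  ?place a <{class_iri}> ;
         rdfs:label ?label ;
         geo:hasGeometry/geo:asWKT ?wkt .
  FILTER(CONTAINS(LCASE(STR(?label)), "{safe}"))
  FILTER(geof:sfWithin(?wkt, "{wkt}"^^geo:wktLiteral))
}}"""


# Hypothetical class IRI and map extent, for illustration only.
query = build_query("http://example.org/AdministrativeRegion", "paris", (2.0, 48.0, 3.0, 49.0))
```

The resulting string would then be sent to a GeoSPARQL-enabled endpoint; the endpoint URL is deliberately left out because it is not given here.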
ADL Linked Data Visualizer:
2017/07/16 - 2017/07/18
In this work, we developed a semantically enriched geospatial data visualization and searching framework and evaluated it using a subset of places from DBpedia. The resulting map, as a representation of the semantic distribution of these geographic features, is produced by using multiple techniques including paragraph vector and clustering. Next, an information retrieval (IR) model is developed based on the vector embedding of each geographic feature. The results are visualized using the semantic similarity-based map as well as a regular map. We believe such visualization can help users to understand latent relationships between geographic features that may otherwise seem unrelated.
2016/10 - Present
In this work, we developed a Linked Data Visualization Interface for the IOS Press Linked Dataset. This interface can visualize the Linked Data Cloud as a graph, and users can further explore the graph content using the right-click menu. The left sidebar displays the information of the current entity. The right sidebar helps users do "Relationship Finder" style search.
2015/09 - 2017/06
Density-based clustering algorithms such as DBSCAN have been widely used for spatial knowledge discovery as they offer several key advantages compared to other clustering algorithms. They can discover clusters with arbitrary shapes, are robust to noise and do not require prior knowledge (or estimation) of the number of clusters. The idea of using a scan circle centered at each point with a search radius Eps to find at least MinPts points as a criterion for deriving local density is easily understandable and sufficient for exploring isotropic spatial point patterns. However, there are many cases that cannot be adequately captured this way, particularly if they involve linear features or shapes with a continuously changing density such as a spiral. In such cases, DBSCAN tends to either create an increasing number of small clusters or add noise points into large clusters. Therefore, in this paper, we propose a novel anisotropic density-based clustering algorithm (ADCN). To motivate our work, we introduce synthetic and real-world cases that cannot be sufficiently handled by DBSCAN (and OPTICS). We then present our clustering algorithm and test it with a wide range of cases. We demonstrate that our algorithm can perform as well as DBSCAN in cases that do not explicitly benefit from an anisotropic perspective and that it outperforms DBSCAN in cases that do. We show that our approach has the same time complexity as DBSCAN and OPTICS, namely O(n log n) when using a spatial index and O(n^2) otherwise. We provide an implementation and test the runtime over multiple cases. Finally, we apply DBSCAN, OPTICS, and our ADCN to the extraction of urban areas of interest (AOI) from geotagged photos in six cities. Visual comparison shows that, compared to DBSCAN and OPTICS, ADCN is inclined to extract AOI with linear shapes which follow the underlying road network. ADCN also tends to connect areas whose spatial distributions share a similar direction.
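The isotropic scan-circle criterion described above (at least MinPts points within radius Eps) can be sketched as follows. This shows only the standard DBSCAN-style density test, not ADCN's anisotropic search region, and it uses a brute-force O(n^2) neighbor search with no spatial index. The function names are hypothetical, and counting the center point itself in its neighborhood is a convention assumed here.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def eps_neighbors(points: Sequence[Point], i: int, eps: float) -> List[int]:
    """Indices of all points inside the scan circle of radius eps around points[i].

    The center point itself is included (assumed convention).
    """
    cx, cy = points[i]
    return [
        j for j, (x, y) in enumerate(points)
        if math.hypot(x - cx, y - cy) <= eps
    ]


def core_points(points: Sequence[Point], eps: float, min_pts: int) -> List[int]:
    """Indices of points whose scan circle contains at least min_pts points."""
    return [
        i for i in range(len(points))
        if len(eps_neighbors(points, i, eps)) >= min_pts
    ]
```

Because the circle treats all directions equally, points strung along a line have few neighbors per circle, which is the situation the abstract says an anisotropic search region is meant to handle.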
Data & Platform:
2016/01 - 2016/04
Funded by NSF EarthCube program
GeoLink is a building block project of the EarthCube project. It aims at building an oceanography data integration framework for seven data repositories and a collection of Ontology Design Patterns. My contribution to this project is a graph visualizer for querying the paths between two entities. This graph visualizer helps users discover data through follow-your-nose search across different data repositories.
2014/01 - 2015/07
Funded by the Fundamental Research Funds for the Central Universities (No. 2042014kf0048)
2014/01 - 2015/12
Funded by Open Research Fund Program of Key Laboratory of Digital Mapping and Land Information Application Engineering, National Administration of Surveying, Mapping and Geoinformation (No. GCWD201404)
Recently, tea plantation expansion has become a typical land use change in the subtropical zone of China. My experiment integrated remote sensing, GIS-based spatial analysis, landscape metric analysis and spatial regression to quantify the socioeconomic indicators of tea plantation expansion and its effects on landscape pattern, using Hangzhou, China, from 2004 to 2013 as a case study. The main results showed that: (1) Hangzhou has undergone great tea plantation expansion, about 54975.9 ha, from 2004 to 2013. (2) Tea plantation expansion is highly related to several physical, social, and economic factors: slope, elevation, distance to water bodies, distance to roads, distance to socioeconomic centers, public financial revenue and per capita average income of farmers. (4) Tea plantation expansion makes the landscape more fragmented, complex and irregular. Our study contributes to understanding the socioeconomic indicators of tea plantation expansion and its effects on landscape pattern in subtropical China.
2013/05 - 2015/06
Funded by Planning Project of Innovation and Entrepreneurship Training of National Undergraduate of Wuhan University (No.1310486034)
As the project leader of this undergraduate training project, I developed a C# application to collect check-in data from Sina-Blog. We then evaluated the effect of the construction of Wuhan Subway Line 2 on the spatial distribution of the Sina-Blog check-in data.
2014/12 - 2015/01
I made two maps of Hengche County using ArcGIS Desktop. The left one shows the current land use map of Hengche County. The right one shows the current land use structure of Hengche County.
Software Development Internship
06/27/2017 - 09/15/2017
Redlands, CA, USA
Elasticsearch, Machine Learning/Natural Language Processing (Artificial Neural Network, Named Entity Recognition), QML
2018/10/08 - 2018/10/12
NSF Student Travel Awards for ISWC 2018
2018/08/27 - 2018/08/31
ESRI GIScience 2018 Student Travel Awards
2018/08/27 - 2018/08/31
The Jack & Laura Dangermond Travel Scholarship for GIScience 2018
2018/06/12 - 2018/06/15
The Jack & Laura Dangermond Travel Scholarship for AGILE 2018
2018/04/10 - 2018/04/14
The Jack & Laura Dangermond Travel Scholarship for 2018 AAG Annual Meeting
2018/03/01 - 2018/03/02
NSF Student Fellowship for U.S. Semantic Technologies Symposium 2018 (US2TS 2018)
2017/11/07 - 2017/11/10
The Jack & Laura Dangermond Travel Scholarship for ACM SIGSPATIAL 2017
The 1st Place Best Paper Award at AAG 2017 GIS Special Group Student Paper Competition for "Beyond Coordinates: Incorporating Geographic Knowledge into Geocoding Services Using Linked Open Data" (co-author)
2017/04/05 - 2017/04/09
The Jack & Laura Dangermond Travel Scholarship for 2017 AAG Annual Meeting
2016/10/31 - 2016/11/03
The Jack & Laura Dangermond Travel Scholarship for the 24th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (ACM SIGSPATIAL 2016)
2016/09/27 - 2016/09/30
The Jack & Laura Dangermond Travel Scholarship for the 9th International Conference on Geographic Information Science (GIScience 2016)
2016/09/05 - 2016/09/09
NSF Student Fellowship for 12th Reasoning Web Summer School (RW2016)
2016/09/05 - 2016/09/09
The Jack & Laura Dangermond Travel Scholarship for 12th Reasoning Web Summer School (RW2016)
2016/03/29 - 2016/04/02
The Jack & Laura Dangermond Travel Scholarship for 2016 AAG Annual Meeting
UCSB Geography Doctoral Scholars Fellowship
Outstanding Undergraduate of Wuhan University
2013/09 - 2014/06
China National Fellowship
2013/09 - 2014/06
2012/09 - 2013/06
China National Fellowship
2012/09 - 2013/06
2011/09 - 2012/06
China National Fellowship
Program Committee Member, 9th International Conference on Knowledge Capture, Dec. 4 - 6, 2017, Austin, USA.