From: Location inference for hidden population with online text analysis
 | The Gazetteer-based Method | Part-of-speech (POS) Tagging | Named Entity Recognition (NER) |
---|---|---|---|
Features | Identifying geographical names by matching text against external location knowledge (e.g., a dictionary of city and state names) | Recognizing geographical terms in a corpus based on the part of speech of their component words, determined from both their definitions and contexts | Identifying words in an unstructured corpus and classifying them into pre-defined entity classes (e.g., persons, locations, organizations), based on hidden Markov models (HMMs) |
Strengths | A popular approach for finding locations in Web text [45]; the algorithm is simple and easy to implement | Part-of-speech information is a prerequisite for many natural language processing (NLP) algorithms | The algorithm is fast and suitable for processing large-scale datasets |
Limitations | Relies heavily on the gazetteer and is easily affected by external geographic databases [46,47,48] | Vulnerable to linguistic errors and idiosyncratic style [38]; algorithm accuracy is relatively low | Cannot identify names of local streets or buildings, non-standard place abbreviations, or misspellings, all of which are common in microtext |
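
To make the gazetteer-based column concrete, the following is a minimal sketch of dictionary lookup over word n-grams. It is an illustration only, not the method used in the cited work: the `GAZETTEER` contents, the `find_locations` name, and the longest-match-first strategy are all assumptions introduced here.

```python
import re

# Hypothetical toy gazetteer (assumption): lower-cased name -> canonical name.
# A real gazetteer would be loaded from an external geographic database.
GAZETTEER = {
    "new york": "New York",
    "los angeles": "Los Angeles",
    "texas": "Texas",
    "chicago": "Chicago",
}

# Longest entry, in words, bounds the n-gram window we need to try.
MAX_NGRAM = max(len(name.split()) for name in GAZETTEER)


def find_locations(text):
    """Return canonical gazetteer names found in text, in order of appearance."""
    tokens = re.findall(r"[a-z]+", text.lower())
    found = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate first so "new york" wins over shorter spans.
        for n in range(min(MAX_NGRAM, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            if candidate in GAZETTEER:
                found.append(GAZETTEER[candidate])
                i += n
                break
        else:
            # No entry starts at this token; move on.
            i += 1
    return found


if __name__ == "__main__":
    print(find_locations("Moved from New York to Chicago last year"))
    print(find_locations("stuck in chi-town, NYC bound"))
```

The second call returns an empty list because neither "chi-town" nor "NYC" is in the toy dictionary, which illustrates the dependence on gazetteer coverage noted in the Limitations row.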