Since 2013, I have been teaching portions of ANU COMP4650/6490: Document Analysis, Semester 2 (Jul-Nov), at the Australian National University. The course covers four broad areas: (A) information retrieval, (B) natural language processing applications, (C) natural language processing in practice, and (D) machine learning for documents. It covers basic tasks including content collection and extraction, formal and informal natural language processing, information extraction, information retrieval, classification and analysis.

Ongoing projects

Translating natural language sentences into Deontic Rules, with Guido Governatori. REGTECH DATASET
Clinical Psychology and NLP, with Brendan Loo Gee and Luis Salvador-Carulla.
Taxonomy Induction, with Rogelio Nazar.

Completed projects

Tao Ni and Qing Wan. Open Relation Extraction. Honors Thesis, Australian National University.
Bing Bo and Danielle Barth. Topic Modelling on Endangered Language: Matukar Panau and Tok Pisin. Data61, CSIRO, and Australian National University
Yufeng and Danielle Barth. Text Classification with Small Data: the case of Endangered Language Matukar Panau. Data61, CSIRO, and Australian National University
Xiang Li and Hanna Suominen. Transformer Semantic Parser. Data61, CSIRO, and Australian National University
Zoe Piper and Rebecca Hinton. FindHer in the ExpertConnect Platform. Data61, CSIRO, Australia.
Alvin Kenardi. Transfer Learning for Semantic Parsing. Individual Research Project, Australian National University.
Xiang Li. Neuronal Semantic Parsing. Data61, CSIRO, Australia.
Marian-Andrei Rizoiu, Hanna Suominen, Svitlana Chernykh, Keith Dowding, Richard Frank, Benjamin E. Goldsmith, Charles Miller. Hate Speech Detection. Data61, CSIRO and Australian National University.
Lizhen Qu, Liyuan Zhou and Weiwei Hou. Deep Learning for Information Extraction. Data61, CSIRO and Australian National University.
Hanxiu Chen and Priscilla Kan Joh. Sentence Representation. Individual Research Project. Australian National University.
Shenjia Ji. How to tell real from fake. Honors Thesis, Australian National University.
Patent technologies: prior art search, automatic patent classification, trademark search.
Jaume Nualart. Cross-reads: data exploration and visualization.
Dilesha Nilakshi Seneviratne Dissanayake Wasala Mudiyanselage Hakmana Walawwe. Patent Wikification. PhD candidate, Queensland University of Technology, Brisbane, Australia.
Honggu Lin. A Patent Retrieval and Visualization Case Study. Individual Research Project. Australian National University.
Mona Golestan Far. Patent Prior Art Search. MPhil, Australian National University.
Shichao Dong. Improving Patent Claim Readability Through a Linguistically Motivated Indentation. Honors project, Australian National University.
Donglu Wang. Automated Classification of Human Genome Sequences in Patent Documents. Engn R&D Research project, Australian National University.
Duong Nhu. Verbose Patent Classification. Swinburne University, Melbourne.
Donglu Wang. Patent Simplification, co-supervised with Hanna Suominen. ANU.
TALN Research Group (Barcelona) TOPAS Tool Platform for Intelligent Patent Analysis and Summarization. Funding: FP7-SME-2011 286639, European Commission
TALN Research Group (Barcelona) HARenES Writing Tool Helper for Collocation Processing. Funding: FFI2011-30219-C02-02, Spanish Ministry of Science
TALN Research Group (Barcelona) ColocaTe - Development of tools for collocations learning. Funding: FFI2008-06479-C02-01, Spanish Ministry of Science
TALN Research Group (Barcelona) PATExpert - Advanced Patent Document Processing Techniques. Funding: FP6 028116, European Commission
TALN Research Group (Barcelona) i3media - Technologies for the automatic creation and management of intelligent audiovisual content. Funding: National
Institut Universitari de Lingüística Aplicada (Barcelona) Corpus Multilingüe de Llenguatges Especialitzats [Multilingual Corpus of Specialized Discourse]. Funding: CS93-4009 (from 1993-96), CREL (from 1997-2000) and IULA-UPF (from 1993-2006).


Bokum Kong. Russian Trolls. Individual Research Project (2019). ANU.
Jiaqi Zhang. How to Tell Real from Fake: research on text generation and linguistic feature in text Classification. Individual Research Project (2019) ANU.
Zhengjie Wang. Utilizing Edits And Low Resource Approaches For Grammatical Error Correction. Honors dissertation (2018). ANU.
Song, Ziwenhan. Multiword Expression Aware Dependency Parsing (2018). ANU.
Vincent Au. Utilising Word-Level Features for Improving Grammatical Error Detection. Honors dissertation (2018). ANU.
Ziwenhan Song. Multi-word Expression Aware Dependency Parsing. Individual Research Project (2018). ANU.
Wei Chu. Multi-word Expression Recognition with Few-Shot Learning. Individual Research Project (2018). ANU.
Yixuan Ni. Deep Checker: Deep Learning Based Grammar Checker. Honors dissertation (2017). ANU.
Quyu Kong. Modeling Information Diffusion in Social Networks. Master's thesis (2017). ANU.
Zhuang Li. Deep Learning based Relation Extraction. Honors dissertation (2015). ANU.
Sarah Bull. Data Mining in IP Australia. Honors dissertation (2013). ANU.

Community Engagement

Code Like a Girl: Talk Nerdy. Event participation and networking. April 2019.
Science in ACTion. Career advisor in Science and Technology. Sep 2017.
Girls in ICT Day. A one-day event held on the Australian National University campus to encourage girls in years 9-12 to explore the fields of computing and technology. April 2016.
National Science Week. Career advisor in Science and Technology. 2015.