Dr. David S. Batista


Employee, Lead Natural Language Processing Engineer, Comtravo

Berlin, Germany

About me

Experienced in both research and industry, I enjoy taking solutions from concept to production and transforming natural language text into structured data. In the past I've tackled problems with strong Machine Learning and Natural Language Processing components, involving tasks such as information extraction, classification, clustering, and information retrieval. I consider myself a practical problem solver and like to deliver production-ready software, not just results.

  • Homepage: http://www.davidsbatista.net
  • GitHub: https://github.com/davidsbatista
  • Publications: http://goo.gl/uihrcx

Skills and expertise

Natural Language Processing
Machine Learning
Python
Software Development
Text Mining
Deep Learning
Data Mining
Data Science
Web Services
AWS
SQL
Computer Science
Natural Language Processing (NLP)
Elasticsearch
PyTorch

Career

Professional experience of David S. Batista

  • To date 3 years and 1 month, since June 2021

    Lead Natural Language Processing Engineer

    Comtravo

    • Leading the Automation team, which works on the system that automatically answers incoming email travel requests and assists travel agents in handling them.
    • Developed several modularised Python components with type annotations, building algorithms that map input text to the corresponding unique identifiers in a target knowledge base, e.g., airports, train stations, hotels, and geographic locations.
    • Trained and evaluated models for text classification and fine-grained NER, increasing the performance of the system.

  • 1 year and 8 months, Oct. 2019 - May 2021

    Senior Natural Language Processing Engineer

    Comtravo
  • 2 years and 2 months, Aug. 2017 - Sep. 2019

    Natural Language Processing Engineer

    Comtravo
  • 1 year and 6 months, Jan. 2016 - June 2017

    Data Engineer

    HelloFresh

    • Built and maintained several ETLs using PySpark (Apache Spark) and Hive.
    • Developed the first prototype for managing ETL pipelines based on Airflow operators, which later went into production and was used by the team.
    • Built a classifier using NLTK and linear models from scikit-learn to identify mentions of different types of issues with the meal kits in customer reviews.
    • Technologies: Python, NLTK, scikit-learn, PySpark, Hive, Airflow

Education of David S. Batista

  • 2011 - 2015

    PhD - Information Extraction and Natural Language Processing

    Instituto Superior Técnico

Languages

  • Portuguese

    Native language

  • English

    Fluent

  • German

    Fluent
