Big Data and NLP with Python: 2-in-1
Gain valuable insights from your data by streamlining unstructured data pipelines with Big Data and NLP in Python
Description
Natural language processing (NLP) and Big Data are among the most interesting subfields of data science. You will learn to use Python, one of the most popular programming languages, together with Apache Spark, a leading Big Data technology. If you're a data science professional who is familiar with Python and wants to acquire NLP and Big Data skills, then this Learning Path is for you.
This comprehensive 2-in-1 course teaches you how to efficiently ingest, query, and analyze data using MongoDB and Spark. You will also learn practical NLP techniques and methods to analyze your text data. It blends concepts with practical examples, which makes the material easy to understand and implement. It follows a logical flow, so you build on your understanding of the different Big Data and NLP techniques with every section.
This training program includes 2 complete courses, carefully chosen to give you the most comprehensive training possible.
The first course, Working with Big Data in Python, starts by explaining the use of MongoDB and how it differs from SQL and structured data, and then walks you through setting up your first database and running your first query. You will then learn how to use MongoDB from Python, including working with the PyMongo library, retrieving results from MongoDB cursors, and building up complex aggregation pipelines using operators. You will also work on an example that builds a data pipeline using PyMongo. Next, you will be introduced to Spark as the main software framework for working with large datasets across distributed computing resources. Finally, you will explore another live example of a data science workflow using MongoDB and Spark, which includes the analysis of Reddit comments and a machine learning task to predict comment popularity.
The second course, Next Generation Natural Language Processing with Python, begins by explaining how NLP can help you extract useful information from large collections of text data, and how you can use the latest Python libraries for NLP. You will then learn how to solve a practical problem using NLP by building a spam SMS detector. You will also learn to convert words into numbers that can be analyzed. Next, you will learn how to label new documents accurately, obtain an accuracy score, and cluster your data. You will then be introduced to more advanced analysis, in which you will learn to model text using vector space models and to use semantic parsing to break down the components of a sentence. Finally, you will work with neural networks and learn how to generate believable text.
By the end of this Learning Path, you’ll be able to use the latest Big Data and NLP libraries in Python for your day-to-day data science tasks.
Meet Your Expert(s):
We have the best work of the following esteemed author(s) to ensure that your learning journey is smooth:
- Alexis Rutherford is a Research Scientist at MIT Media Lab. He has a PhD in Physics and nearly 10 years of experience using Python for data analysis and modeling, gained at the United Nations, Facebook, and elsewhere. He has used data analysis to tackle problems in many areas, including epidemiology, ethnic violence, vaccine hesitancy, and constitutional change, and has built pipelines for social media data, legal documents, and news articles, among others. He blogs and tweets regularly on data science and data privacy.
What You Will Learn!
- Learn how to efficiently ingest, query, and analyze data using MongoDB and Spark
- Learn practical NLP techniques and methods to analyze your text data
- Write MongoDB queries using operators and chain these together into aggregation pipelines
- Get to grips with powerful new libraries such as Gensim, spaCy, and Keras
- Apply different techniques to categorize text data
- Extract meaning and insights from text data using techniques such as vector space models
Who Should Attend!
- This Learning Path is for data engineers, data scientists, researchers, and developers who wish to know how to efficiently ingest, query, and analyze data using MongoDB and Spark.