Top Twitter Topics by Data Scientists #42

Trending this week: attend the NYU Deep Learning SP21 course by Yann LeCun; discover the top 38 Python libraries for data science according to KDnuggets; try DataSpell, the IDE for professional data scientists.

Every week we analyze the most discussed topics on Twitter by Data Science & AI influencers.

The following topics, URLs, resources, and tweets have been automatically extracted using a topic modeling technique based on Sentence BERT, which we have enhanced to fit our use case.

Want to know more about the methodology used? Check out this article for more details, and find the code in this GitHub repository!

Overview

In this new installment of our technology watch series, we will talk about:

  • Amazing DL & ML Resources
  • Very Useful Coding Tools for Data Scientists
  • AI in Fintech

Discover what Data Science and AI influencers have posted on Twitter this week in the following paragraphs.

Amazing DL & ML Resources

This week, some Data Science and AI influencers have shared some amazing resources on deep learning (DL) and machine learning (ML). They will help you learn the basics of DL and ML and go beyond these introductory concepts.

Mike Tamir has posted the following resources:

A series of lectures by Yann LeCun titled NYU Deep Learning SP21. This course concerns the latest techniques in deep learning and representation learning, focusing on supervised and unsupervised deep learning, embedding methods, metric learning, convolutional and recurrent nets, with applications to computer vision, natural language understanding, and speech recognition.

This series of 30 lectures is divided into 7 main themes:

  • Parameters sharing
  • Energy-based models, foundations
  • Energy-based models, advanced
  • Associative memories
  • Graphs
  • Control
  • Optimization

The GitHub repo ML-For-Beginners by Microsoft. This 12-week course contains 26 lessons and 50 quizzes on classic machine learning for all, using primarily Scikit-learn as a library and avoiding deep learning, which is covered in their forthcoming “AI for Beginners” curriculum. The repo uses a project-based pedagogy that lets you learn while building.

Learners just have to fork the entire repo to their own GitHub account and complete the exercises on their own or with a group.

Finally, some suggestions have been included for teachers on how to use this curriculum.

A guide to Knowledge Graphs. This post consolidates notes that give a brief, gentle introduction to knowledge graphs and shed light on several practical aspects. It covers the what, why, and how of knowledge graphs, and it also goes through some real-world examples.
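To make the idea concrete, here is a minimal sketch of a knowledge graph stored as (subject, predicate, object) triples. It is not taken from the guide: the `TripleStore` class and its methods are hypothetical names chosen for illustration, and the facts loaded into it are simply the ones mentioned earlier in this post.

```python
from collections import defaultdict


class TripleStore:
    """Toy knowledge graph: a set of (subject, predicate, object) triples."""

    def __init__(self):
        self.triples = set()
        # Index by subject so subject-bound queries avoid a full scan.
        self.by_subject = defaultdict(set)

    def add(self, subject, predicate, obj):
        triple = (subject, predicate, obj)
        self.triples.add(triple)
        self.by_subject[subject].add(triple)

    def query(self, subject=None, predicate=None, obj=None):
        """Return sorted triples matching the pattern; None acts as a wildcard."""
        if subject is None:
            candidates = self.triples
        else:
            candidates = self.by_subject.get(subject, set())
        return sorted(
            t for t in candidates
            if (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )


# Facts drawn from this post (hypothetical predicate names).
kg = TripleStore()
kg.add("Yann LeCun", "teaches", "NYU Deep Learning SP21")
kg.add("NYU Deep Learning SP21", "covers", "energy-based models")
kg.add("NYU Deep Learning SP21", "covers", "graphs")
kg.add("Microsoft", "publishes", "ML-For-Beginners")
kg.add("ML-For-Beginners", "uses", "Scikit-learn")

# Which themes does the course cover?
for _, _, theme in kg.query(subject="NYU Deep Learning SP21", predicate="covers"):
    print(theme)
```

Real knowledge graph systems add schemas, identifiers, and query languages on top of this, but the triple pattern with wildcards is the core idea.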

For his part, Kirk Borne has shared a link to the Data Science Central Search Engine, where you can find links to more than 100 articles and resources on neural networks. It covers a wide range of deep learning topics.

Very Useful Coding Tools for Data Scientists

Influencers have also shared some very useful tools and libraries that can enhance your data science skills and boost your projects.

We have retained the following collection by KDnuggets:

Top 38 Python Libraries for Data Science. This article compiles the top 38 Python libraries for data science, data visualization & machine learning, as determined by KDnuggets staff. The list comprises selected tools for:

  • Data processing: Spark, Pandas, Dask;
  • Maths: SciPy, NumPy;
  • Machine learning (ML): scikit-learn, XGBoost, LightGBM, StatsModels, Prophet, and others;
  • AutoML: auto-sklearn, Hyperopt-sklearn, SMAC-3, scikit-optimize, and others;
  • Dataviz: Matplotlib, Plotly, Seaborn, Bokeh, and others;
  • Explanation and exploration: eli5, LIME, SHAP, YellowBrick, pandas-profiling.

A post that introduces Git and GitHub for beginners. This guide aims to help beginners build their skills and have an easy time learning and using these tools.

A blog post that provides 8 Cool VS Code tips to make your workspace more personal. In this post, you will learn about VS Code and 8 tips to make your workflow with it even more efficient.

A post that introduces DataSpell: A New Amazing IDE for Data Science. DataSpell is an IDE from JetBrains made specifically for professional data scientists. It is currently in preview, but you can sign up for it here.

This post demonstrates what DataSpell offers, and goes over the basic introduction, creating your first notebook, smart code assistance, database support, markdown, and much more.

DataSpell is presented as the right alternative to JupyterLab and PyCharm if most of what you do is purely data science.

AI in Fintech

This week the data science and artificial intelligence influencers on Twitter spoke extensively about the application of AI in the Fintech domain.

ipfconline has shared a very interesting article on How Transformer-Based Machine Learning Can Power Fintech Data Processing. The article mentions that developing data-driven fintech products means dealing with high volumes of complex and, at times, unstructured data. Natural language processing mechanisms like classification and named entity recognition are crucial to turning disparate or unstructured transaction information into data sets that can be analyzed much more efficiently. Once processed, this data can be used for various applications in Fintech.
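The article relies on transformer models, which the sketch below does not reproduce. As a deliberately simple stand-in, it uses hand-written regular expressions and keyword matching, only to illustrate what turning an unstructured transaction string into a structured record can look like. The sample strings, field names, and category keywords are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical patterns: an amount followed by a currency code, and an ISO date.
AMOUNT_RE = re.compile(r"(\d+(?:\.\d{2})?)\s*(USD|EUR|GBP)")
DATE_RE = re.compile(r"\b(\d{4}-\d{2}-\d{2})\b")

# Hypothetical keyword lists standing in for a learned classifier.
CATEGORY_KEYWORDS = {
    "groceries": ("market", "grocery"),
    "transport": ("taxi", "metro"),
}


@dataclass
class Transaction:
    date: Optional[str]
    amount: Optional[float]
    currency: Optional[str]
    category: str


def parse_transaction(text: str) -> Transaction:
    """Extract entities (date, amount, currency) and assign a category."""
    amount_match = AMOUNT_RE.search(text)
    date_match = DATE_RE.search(text)

    # Keyword lookup plays the role of the classification step.
    lowered = text.lower()
    category = "other"
    for name, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            category = name
            break

    return Transaction(
        date=date_match.group(1) if date_match else None,
        amount=float(amount_match.group(1)) if amount_match else None,
        currency=amount_match.group(2) if amount_match else None,
        category=category,
    )


# Made-up input string, for illustration only.
print(parse_transaction("2021-06-14 CARD PAYMENT CITY MARKET 23.50 EUR"))
```

A learned model would replace both the regular expressions and the keyword table, but the output shape, a typed record per transaction, is what makes the downstream analysis easier.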

Ronald Van Loon has shared an article titled ‘Why Measuring Tech ROI Can Be Complicated and How to Simplify It.’ The article talks about the implementation of AI and other automation technologies to make daily work operations more efficient. To understand the true ROI of AI, organizations must ask themselves some key questions.

Finally, Kirk Borne has shared a very interesting article on Integrating IoT Data with Digital Twin Knowledge Graph. Building a digital twin knowledge graph data model raises the need for access to measurement data, so that the digital twin can produce timely performance metrics, promptly identify performance issues, and so on.

We hope you enjoyed this new post in our technology watch series. Stay tuned!