Top Twitter Topics by Data Scientists #22

Trending this week: EvalML library for AutoML; A graph-based text similarity measure that employs named entity information; Why AI struggles to grasp cause and effect.

Every week we analyze the most discussed topics on Twitter by Data Science & AI influencers.

The following topics, URLs, resources, and tweets have been automatically extracted using a topic modeling technique based on Sentence BERT, which we have enhanced to fit our use case.

Want to know more about the methodology used? Check out this article for more details, and find the code in this GitHub repository!


This week, Data Science and AI influencers on Twitter have talked about:

  • ML Updates
  • AI Discussions
  • The Future Of Data Science Jobs

The following sections provide all the details for each topic.

ML Updates

This week, influencers have shared some updates on machine learning.

Here are some updates shared by KDnuggets:

A post presenting EvalML, a Python library for automated machine learning (AutoML) and model understanding. This new open-source project has joined the Alteryx open-source ecosystem. EvalML can reduce the effort it takes to get to an accurate model, saving time and reducing complexity, as it: performs data checks to catch common problems with your data prior to modeling; performs data preprocessing and feature engineering steps out of the box; grants access to a variety of models and tools for model understanding. Links to the EvalML GitHub repo and documentation are also provided.
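To make the idea of "data checks prior to modeling" concrete, here is a minimal sketch in plain Python. It does not use EvalML and is not its API: the function name `run_data_checks`, the threshold, and the toy rows are all hypothetical, chosen only to illustrate the kind of problems such checks might flag.

```python
def run_data_checks(rows, target, max_missing_ratio=0.5):
    """Return a list of warnings for a dataset given as a list of dicts.

    Hypothetical helper for illustration only; not part of EvalML.
    """
    if not rows:
        return ["dataset is empty"]

    warnings = []
    for col in rows[0].keys():
        values = [row.get(col) for row in rows]
        present = [v for v in values if v is not None]
        missing = len(values) - len(present)

        if missing / len(rows) > max_missing_ratio:
            # Too many missing values to be useful as a feature.
            warnings.append(f"{col}: {missing}/{len(rows)} values missing")
        elif len(set(present)) <= 1:
            # A column with a single value carries no signal.
            warnings.append(f"{col}: constant column")

        if col == target and missing > 0:
            # Rows without a label cannot be used for supervised training.
            warnings.append(f"{col}: target has missing values")
    return warnings


# Toy data (made up for this example).
rows = [
    {"age": 30, "city": None, "country": "FR", "label": 1},
    {"age": 41, "city": None, "country": "FR", "label": 0},
    {"age": 25, "city": "Lyon", "country": "FR", "label": 1},
]
report = run_data_checks(rows, target="label")
for line in report:
    print(line)
```

Running checks like these before any model search makes data problems visible early, instead of letting them surface as a poor or misleading score later on.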

An article that summarizes a research paper published in 2017 titled “A Graph-based Text Similarity Measure That Employs Named Entity Information”. This post explains a novel technique for calculating text similarity based on a named-entity-enriched graph representation of text documents. The approach combines 3 main steps: extraction of named entities and top-ranked terms in the texts; graph representation of the extracted information; calculation of specific graph measures for measuring the similarity between two graphs.
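The three steps can be sketched in a few lines of standard-library Python. This is only an illustration under loud assumptions, not the paper's method: named entities are approximated by capitalized words that do not start a sentence, top-ranked terms by raw frequency, and the graph measure by a simple Jaccard overlap of nodes and edges. All function names here are hypothetical.

```python
import re
from collections import Counter
from itertools import combinations

STOPWORDS = {"the", "a", "an", "of", "in", "on", "and", "to", "is", "for", "by", "with", "at"}


def split_sentences(text):
    return [s for s in re.split(r"[.!?]+\s*", text) if s]


def extract_nodes(text, top_k=5):
    """Step 1: crude stand-ins for named entities and top-ranked terms."""
    entities, counts = set(), Counter()
    for sentence in split_sentences(text):
        tokens = re.findall(r"[A-Za-z]+", sentence)
        for i, tok in enumerate(tokens):
            if i > 0 and tok[0].isupper():
                # Assumption: a capitalized word inside a sentence is an entity.
                entities.add(tok.lower())
            elif tok.lower() not in STOPWORDS:
                counts[tok.lower()] += 1
    top_terms = {term for term, _ in counts.most_common(top_k)}
    return entities | top_terms


def build_graph(text, top_k=5):
    """Step 2: nodes are the extracted terms, edges link terms sharing a sentence."""
    nodes = extract_nodes(text, top_k)
    edges = set()
    for sentence in split_sentences(text):
        words = {w.lower() for w in re.findall(r"[A-Za-z]+", sentence)}
        edges.update(combinations(sorted(words & nodes), 2))
    return nodes, edges


def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0


def graph_similarity(text_a, text_b, top_k=5):
    """Step 3: average of node overlap and edge overlap (an assumed measure)."""
    nodes_a, edges_a = build_graph(text_a, top_k)
    nodes_b, edges_b = build_graph(text_b, top_k)
    return 0.5 * jaccard(nodes_a, nodes_b) + 0.5 * jaccard(edges_a, edges_b)


doc_a = "Alice met Bob in Paris. Bob likes Paris."
doc_b = "Bob visited Paris with Carol."
doc_c = "Cats sleep all day."
print(graph_similarity(doc_a, doc_b), graph_similarity(doc_a, doc_c))
```

In a real pipeline, the heuristic in step 1 would be replaced by a proper named entity recognizer and a term-ranking scheme, and step 3 by the graph measures described in the paper.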

Dr. Ganapathi Pulipaka has shared:

An article about a machine learning framework that integrates multi-omics data to predict cancer-related long non-coding RNAs (lncRNAs). LncRNAs have emerged as novel candidate biomarkers in cancer diagnosis and prognosis. The post proposes a new machine learning approach, LGDLDA (LncRNA-Gene-Disease association networks based LncRNA-Disease Association prediction), which predicts disease-related lncRNA associations using multi-omics data, machine learning methods, and neural network neighborhood information aggregation.

AI Discussions

This week, influencers have shared content discussing the state and limits of artificial intelligence.

Ipfconline has shared an article on why AI struggles to grasp cause and effect. The post explains how machine learning algorithms, which have managed to outperform humans in complicated tasks such as Go and chess, struggle to make simple causal inferences. It explains that causality remains a challenge for machine learning algorithms, especially deep neural networks.

For his part, Simon Porter has shared a study conducted to identify AI technology implementation strategies and barriers across a wide range of industries. The study, conducted over five years, comprises a series of surveys and interviews with senior managers and executives, as well as in-depth studies of five leading organizations. It resulted in a counterintuitive key takeaway: competing in the age of AI is not about being technology-driven per se; it is a question of new organizational structures that use technology to bring out the best in people. The secret to making this work, the researchers learned, is the business model itself, where machines and humans are integrated to complement each other.

Finally, Tamara McCleary has shared a post addressing what artificial intelligence still can’t do. The post explains that today’s AI still has fundamental limitations. Relative to what we would expect from a truly intelligent agent, AI has a long way to go. Today, mainstream artificial intelligence still can’t: use common sense, learn continuously and adapt on the fly, understand cause and effect, and reason ethically. However, these limitations should be seen as challenges that must be addressed in order to advance the state of the art in AI.

The Future Of Data Science Jobs

This week, data science and AI influencers shared content on the various forms that data science jobs can take today and on resources that help data scientists upskill.

Carla Gentry shared a tutorial on introductory statistics for data science. The tutorial walks you through the basics of statistics and untangles all the buzzwords. She also shared an article on the top 10 data science projects for beginners.

Vin Vashishta mentions that there’s a gap between the skills in high demand and those most data scientists have. He shared a video that details the skills in this gap and how to use them to land a role in the field.

Ronald van Loon shared a checklist to track your data science progress. The checklist contains a list of skills in three categories: entry level, intermediate, and advanced.

Finally, KDnuggets shared an article titled ‘data science is not becoming extinct in 10 years but your skills might’. The article covers the history of data science, explains why real-world data science projects need iterative development, and discusses how to stay in the data science game.

They also shared an article on the top data science skills in 2021, as well as another article explaining the differences between data scientist, data engineer, and other data careers.