Trending this week: Discover the Gated Multi-Layer Perceptron (gMLP) by Google Brain; Use the Graph Data Science Library to enhance your ML projects; Learn how to apply the Neural Style Transfer technique.
Every week we analyze the most discussed topics on Twitter by Data Science & AI influencers.
The following topics, URLs, resources, and tweets have been automatically extracted using a topic modeling technique based on Sentence BERT, which we have enhanced to fit our use case.
In this new installment of our technology watch series, we will talk about:
- Amazing ML Technologies
- NLP Update
- Must-Know ML Concepts
Discover what Data Science and AI influencers have posted on Twitter this week in the following paragraphs.
Amazing ML Technologies
This week, data science and AI influencers have shared some amazing machine learning technologies.
KDnuggets have posted about the following recently released technologies:
Gated Multi-Layer Perceptron (gMLP), a deep-learning model that contains only basic multi-layer perceptrons, was released by researchers at Google Brain. The main innovation in gMLP is a Spatial Gating Unit (SGU) which captures the interactions across sequence elements; this performs the same role as attention in a Transformer, but without requiring encodings for element positions. Using fewer parameters, gMLP outperforms Transformer models on natural language processing (NLP) tasks and achieves comparable accuracy on computer vision (CV) tasks.
On the ImageNet image classification task, gMLP achieves an accuracy of 81.6%, comparable to the 81.8% of Vision Transformers (ViT), while using fewer parameters and FLOPs. On NLP tasks, gMLP achieves a better pre-training perplexity than BERT and a higher F1 score on the SQuAD benchmark (85.4 versus BERT’s 81.8), again with fewer parameters.
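To make the Spatial Gating Unit more concrete, here is a minimal NumPy sketch. It assumes the formulation we understand from the gMLP paper: the input is split in two along the channel dimension, one half is normalized and linearly projected along the sequence dimension, and the result gates the other half element-wise. The function names, shapes, and initialization values are our own illustrative choices, not Google Brain's implementation.

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    # Normalize each sequence element over its channels.
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def spatial_gating_unit(z, w, b):
    """Hypothetical SGU sketch.

    z: (seq_len, d) token representations, d must be even
    w: (seq_len, seq_len) projection across sequence positions
    b: (seq_len,) bias, one value per position
    """
    z1, z2 = np.split(z, 2, axis=-1)   # split along channels
    z2 = layer_norm(z2)
    gate = w @ z2 + b[:, None]         # mixes information across positions
    return z1 * gate                   # element-wise gating

rng = np.random.default_rng(0)
seq_len, d = 4, 6
z = rng.normal(size=(seq_len, d))

# Assumed initialization: w at zero and b at one, so the unit
# starts out as an identity mapping on the first half of z.
w = np.zeros((seq_len, seq_len))
b = np.ones(seq_len)

out = spatial_gating_unit(z, w, b)
print(out.shape)
```

Because `w` is a matrix over sequence positions, each output position can depend on every other position, which is the sense in which the unit plays a role similar to attention.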
Graph Data Science Library, an enhanced graph library for data science that provides more than 50 algorithms, was unveiled by TigerGraph. It is an improved version of the GSQL Graph Algorithm Library, TigerGraph’s former collection of algorithms, whose algorithms have been fine-tuned.
Here’s a summary of what’s new in Graph Data Science Library:
- Library collection: 20+ new algorithms, including embedding algorithms for graph ML, like Node2Vec and FastRP.
- Library structure and management: the library continues to be open-source, on GitHub. Its organization was improved, grouping algorithms by category, and placing each algorithm in its own folder with a README and ChangeLog file. The repository will use tags to identify major releases.
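To give an idea of what embedding algorithms such as Node2Vec build on, here is a small pure-Python sketch of random walks over a graph. It is a deliberate simplification: the walks are uniform (Node2Vec additionally biases them with return and in-out parameters), the walks would normally be fed to a skip-gram model afterwards, and none of this reflects TigerGraph's GSQL implementation. The graph and function names are hypothetical.

```python
import random

def random_walks(adjacency, walk_length, walks_per_node, seed=0):
    """Generate uniform random walks starting from every node."""
    rng = random.Random(seed)
    walks = []
    for start in adjacency:
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adjacency[walk[-1]]
                if not neighbors:      # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks

# Toy undirected graph stored as adjacency lists (illustrative data).
graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}

walks = random_walks(graph, walk_length=5, walks_per_node=2)
for walk in walks:
    print(" -> ".join(walk))
```

Nodes that often appear close together in these walks end up with similar embeddings once the walks are used as training "sentences".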
The Graph Data Science Library will continue to grow and improve to deliver high-performance and easy-to-use graph data science and machine learning to everyone. One next step is to include graph neural networks (GNNs), which represent what is arguably the ultimate integration of connected data analytics and machine learning, since they use the graph structure during the training process itself.
NLP Update
Various tweets have shared articles giving some updates on natural language processing (NLP).
Aran Komatsuzaki has shared a research paper titled “ExT5: Towards Extreme Multi-Task Scaling for Transfer Learning”. This paper presents a corpus (ExMix) and a model (ExT5) which extend work on multi-task pretraining for large language models:
- ExMix covers 8 families of supervised natural language processing tasks (e.g., summarization, classification), with three individual tasks in each of them. As in prior work, these tasks are all cast as a single text-to-text task to allow uniform training across them;
- ExT5 is a version of the T5 model pre-trained on ExMix.
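The text-to-text conversion mentioned above can be sketched in a few lines of Python. This is only an illustration of the idea: the task prefixes, field names, and the `to_text_to_text` helper are hypothetical and are not taken from the ExMix or T5 code.

```python
def to_text_to_text(task, example):
    """Cast a supervised example into an (inputs, targets) pair of strings."""
    if task == "summarization":
        return {
            "inputs": f"summarize: {example['document']}",
            "targets": example["summary"],
        }
    if task == "classification":
        return {
            "inputs": f"classify: {example['text']}",
            "targets": example["label"],   # the label is emitted as plain text
        }
    raise ValueError(f"Unknown task: {task}")

examples = [
    to_text_to_text("summarization",
                    {"document": "A long article.", "summary": "Short."}),
    to_text_to_text("classification",
                    {"text": "Great movie!", "label": "positive"}),
]

for ex in examples:
    print(ex["inputs"], "=>", ex["targets"])
```

Once every task shares this shape, a single sequence-to-sequence model can be trained on all of them with the same loss.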
Training on ExMix allows the model to match the performance of a T5 model trained only on plain text in far fewer steps (approximately half), and yields a large pre-trained model, ExT5, on which extensive evaluations were performed. These evaluations show that multi-task pretraining clearly helps, improving on a strong baseline across many tasks, including tasks that are not part of the ExMix training data, such as translation.
Bob E. Hayes has tweeted about Nvidia making the Megatron 530B language model available to enterprises. Megatron 530B, also known as Megatron-Turing Natural Language Generation (MT-NLG), is a massive language model that contains 530 billion parameters and achieves high accuracy in a broad set of natural language tasks, including reading comprehension, commonsense reasoning, and natural language inference.
Also, the following practical guides on NLP have been shared by KDnuggets:
A practitioner’s guide to named entity recognition (NER). This post introduces the concept of NER and presents some Python libraries implementing NER techniques, namely spaCy, StanfordNERTagger, and Stanford’s Named Entity Recognizer. It also provides code showing how to use them.
A very useful post presenting Text Preprocessing Methods for Deep Learning. Here, NLP text processing methods are provided to help you enhance your pipelines not only in deep learning projects but in conventional machine learning models too.
This post also provides some code associated with the techniques presented.
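As an illustration of what such a preprocessing pipeline can look like, here is a small sketch using only the Python standard library. The steps shown (lowercasing, URL removal, punctuation stripping, stop-word filtering) are common choices we picked for the example and are not necessarily the ones covered in the post; the stop-word list is a tiny hypothetical one.

```python
import re

# Tiny illustrative stop-word list; real projects use a fuller one.
STOPWORDS = {"a", "an", "the", "is", "are", "of", "and", "to", "in"}

def preprocess(text):
    text = text.lower()                        # normalize case
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs
    text = re.sub(r"[^a-z0-9\s]", " ", text)   # drop punctuation and symbols
    tokens = text.split()                      # whitespace tokenization
    return [t for t in tokens if t not in STOPWORDS]

tokens = preprocess("The gMLP model is AMAZING!!! Read more: https://example.com")
print(tokens)  # ['gmlp', 'model', 'amazing', 'read', 'more']
```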
Must-Know ML Concepts
Some tweets have shared articles on ML concepts every data scientist or ML engineer should know.
Ronald Van Loon has shared a transfer learning 101 that explores this popular deep learning approach. This post tackles the following points:
- What is it?
- How does it work?
- Why is it used?
- When should you use it?
- Approaches to transfer learning: Training a model to reuse it; Using a pre-trained model; Feature extraction
Finally, it provides a list of resources for further reading on transfer learning and its applications.
François Chollet has shared a guided tutorial demonstrating how to apply Neural Style Transfer with AdaIN. Neural Style Transfer is the process of transferring the style of one image onto the content of another; it is related to transfer learning in that it typically relies on a pre-trained network to extract image features. Here, the Adaptive Instance Normalization (AdaIN) technique speeds up the processing of the images, enabling arbitrary style transfer in real time.
KDnuggets have shared a blog post on 5 Concepts You Should Know About Gradient Descent and Cost Function. This post discusses the importance of gradient descent in machine learning and explains how this iterative optimization algorithm is used to minimize a loss function.
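The core AdaIN operation is compact enough to sketch in NumPy. The version below assumes the usual formulation, where content features are normalized per channel and then rescaled with the per-channel mean and standard deviation of the style features. It works on random arrays standing in for encoder feature maps, so it is only the normalization step, not the full tutorial pipeline; names and shapes are our own.

```python
import numpy as np

def adain(content, style, eps=1e-5):
    """Adaptive Instance Normalization on (height, width, channels) feature maps."""
    # Per-channel statistics, computed over the spatial dimensions.
    c_mean = content.mean(axis=(0, 1), keepdims=True)
    c_std = content.std(axis=(0, 1), keepdims=True)
    s_mean = style.mean(axis=(0, 1), keepdims=True)
    s_std = style.std(axis=(0, 1), keepdims=True)
    # Normalize the content features, then apply the style statistics.
    return s_std * (content - c_mean) / (c_std + eps) + s_mean

rng = np.random.default_rng(0)
content = rng.normal(size=(8, 8, 3))                    # stand-in content features
style = rng.normal(loc=2.0, scale=3.0, size=(5, 5, 3))  # stand-in style features

stylized = adain(content, style)
print(stylized.mean(axis=(0, 1)), style.mean(axis=(0, 1)))
```

Since the operation has no learned parameters tied to a particular style, any style image can be used at inference time, which is what makes the transfer "arbitrary".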
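For readers who want to see the algorithm in action, here is a minimal pure-Python sketch of gradient descent minimizing a mean squared error cost for a one-variable linear model. The data, learning rate, and number of steps are illustrative assumptions chosen so that the example converges; they do not come from the post.

```python
def gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w * x + b by minimizing the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Partial derivatives of the MSE cost with respect to w and b.
        grad_w = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2 / n * sum(errors)
        # Step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Toy data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2 * x + 1 for x in xs]

w, b = gradient_descent(xs, ys)
print(round(w, 3), round(b, 3))
```

A learning rate that is too large makes the updates diverge, while one that is too small makes convergence slow, which is why it is usually tuned.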
Hope you enjoyed this post, keep reading our series! 😉