When someone mentions Hugging Face 🤗 in the world of AI and machine learning, we tend to nod along even if we do not know what it is.
We just know that we can find AI models at Hugging Face. That's it.
But is that it? Why should you use the platform? And how does it work?
Let me tell you more about it here.
First things first:
Hugging Face is a website where you can upload, download, and find countless pre-trained AI models (under various licenses).
To put it in a wider perspective, imagine GitHub or GitLab, but with datasets and AI models in place of source code. You may also think of Hugging Face as the Docker Hub of AI: Docker Hub has Docker images published for the world, and Hugging Face has AI models.
In other words, it is a paradise on earth for people who are into machine learning and Generative AI.
There are 900,000+ models on the platform, and you can use any of them on your system as long as you follow their usage instructions and license requirements.
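To make "downloading a model" a little more concrete, here is a minimal sketch of how a single file in a public model repository can be addressed. It relies on the Hub's `resolve` URL layout; the repository `amazon/chronos-t5-small` and the file `config.json` are only example values picked for illustration, and the helper names `file_url` and `fetch` are made up for this sketch. Gated or private models would additionally need an access token, which is not handled here.

```python
from urllib.parse import quote
from urllib.request import urlopen

HUB = "https://huggingface.co"


def file_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Build the direct-download URL for one file in a model repository."""
    # repo_id keeps its slash ("owner/name"); a revision containing a slash
    # is percent-encoded so it stays a single path segment.
    return f"{HUB}/{quote(repo_id)}/resolve/{quote(revision, safe='')}/{quote(filename)}"


def fetch(repo_id: str, filename: str, revision: str = "main") -> bytes:
    """Download the file and return its raw bytes (needs network access)."""
    with urlopen(file_url(repo_id, filename, revision), timeout=30) as resp:
        return resp.read()


# Example values only; nothing is downloaded until fetch() is called.
url = file_url("amazon/chronos-t5-small", "config.json")
print(url)
# https://huggingface.co/amazon/chronos-t5-small/resolve/main/config.json
```

In practice you would more likely use the official client libraries, which add caching and authentication on top of the same idea.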
But what makes Hugging Face so special? Let us take a closer look.
History of Hugging Face
It takes a lot to turn a chatbot project into a database of AI models.
Well, that is precisely what Hugging Face did. It all started in 2016, when Hugging Face's initial goal was to make an interactive AI chatbot for teenagers.
But when the company decided to open source its bot, its direction for the future changed. It decided to become a driving force in the AI community and pitched itself as a central hub for AI models.
In 2023, the company was valued at $4.5 billion, with investment from giants like Amazon, Google, and Qualcomm.
Now, you cannot mention most AI models without mentioning Hugging Face.
Here's Why Hugging Face Is a Game Changer
With so many models out there, collaborating on and contributing to machine learning projects can be incredibly challenging.
Enter Hugging Face. Whether you are a complete beginner or a veteran, it made AI models accessible to everyone, not just as services to consume but also as something you can open up and study from the inside.
Hugging Face made AI a bit more open in nature (even if not all the models are open source). While the biggest companies were still debating issues with their models, the community had often already found a solution.
But that is not where it stops. Hugging Face also lets people host their AI models. This makes collaboration with other people easier, which in turn leads to more efficient models. Moreover, you can run a model through the inference API for a quick demo, and do more with it as you scale up.
Along with that, you can also build your portfolio, showcasing your contributions and activities across AI models.
Furthermore, it helps startups and companies of various sizes deploy their AI models easily with a flexible hourly pricing plan.
Now that we know how big a difference Hugging Face has made in the industry, what can you actually use it for? What kind of models do you find there?
Models on Hugging Face
While there are countless models available on the platform, I have listed the most prominent and useful types.
1. Transformer Models
Transformers are a type of deep-learning model fundamental to Natural Language Processing (NLP). These models can translate text and speech in near real time, making them super useful in artificial intelligence.
A transformer model must first be trained on a sizable text dataset before it can be used. Thankfully, Hugging Face provides you with pre-trained models, which you can use for various natural language processing tasks.
A transformer model, for instance, can be used to create text, classify text, or respond to queries.
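The three uses mentioned above map onto task names in the `transformers` library's `pipeline` helper. The following is a minimal sketch, assuming `transformers` and a backend such as PyTorch are installed separately; the `run` wrapper and the `TASKS` table are names invented for this example, and the default model each task downloads is chosen by the library, not by this article.

```python
# Map the plain-English uses from the text to pipeline task names.
TASKS = {
    "create text": "text-generation",
    "classify text": "text-classification",
    "respond to queries": "question-answering",
}


def run(use_case: str, *args, **kwargs):
    """Load a pre-trained pipeline for the use case and call it once."""
    # Imported lazily: transformers is a third-party package and is only
    # needed when a model is actually run.
    from transformers import pipeline

    pipe = pipeline(TASKS[use_case])  # downloads a default model on first use
    return pipe(*args, **kwargs)


# Example calls (they need network access and the libraries above):
# run("classify text", "Hugging Face makes this easy.")
# run("create text", "Once upon a time", max_new_tokens=20)
# run("respond to queries",
#     question="Who hosts the models?",
#     context="Hugging Face hosts pre-trained AI models.")
```

For anything beyond a quick experiment, you would pass an explicit model name to `pipeline` instead of relying on the default.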
2. Image Classification Models
From analyzing MRI scans to powering image search on the web, you have almost certainly made use of image classification models somewhere or the other. These models have found a use in almost every field.
These are algorithms that classify photos into pre-established groups or categories.
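To illustrate the "pre-established categories" part, here is a small sketch of the last step of a typical classifier: turning raw scores into probabilities and picking the most likely labels. The labels and scores are made-up values, and the neural network that would normally produce the scores from an image is left out on purpose.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(scores, labels, k=2):
    """Return the k most likely (label, probability) pairs."""
    probs = softmax(scores)
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical categories and model scores for one photo.
LABELS = ["cat", "dog", "car"]
raw_scores = [2.0, 0.5, -1.0]

predictions = top_k(raw_scores, LABELS)
for label, prob in predictions:
    print(f"{label}: {prob:.2f}")
```

The set of labels is fixed when the model is trained, which is why such a model can only ever answer with one of its known categories.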
3. Image Generation Models
Whether you are a working professional or just use AI models for fun, you have probably heard of projects like Stable Diffusion. It is an image-generation model that generates images based on the prompts you provide.
You can find similar image-generation models on Hugging Face.
4. Time Series Forecasting Models
Forecasting models are among the most widely used models in financial and industrial applications. They predict future values based on historical data.
These models are essential for following market trends and keeping up with customer demand. While extremely useful, they need a lot of data to set up and are harder to run in real time. Thankfully, Hugging Face makes our lives just a tad bit easier.
One of the most downloaded models on Hugging Face is Chronos T5, a time series forecasting model.
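To show what "predicting the future based on historical data" means at its simplest, here is a moving-average baseline. This is not how Chronos T5 or any other model on the platform works; it is only a toy sketch, with made-up sales figures and a function name invented for the example.

```python
def moving_average_forecast(history, horizon, window=3):
    """Forecast `horizon` future points, each as the mean of the last `window` values."""
    if window < 1 or len(history) < window:
        raise ValueError("need at least `window` historical values")
    values = list(history)
    forecast = []
    for _ in range(horizon):
        next_value = sum(values[-window:]) / window
        forecast.append(next_value)
        values.append(next_value)  # later steps build on earlier forecasts
    return forecast


# Hypothetical monthly sales figures.
monthly_sales = [100, 110, 120, 130, 140, 150]
forecast = moving_average_forecast(monthly_sales, horizon=2)
print(forecast)
```

A baseline like this is mostly useful as a yardstick: a pre-trained forecasting model should beat it on your data before you rely on it.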
5. Voice Activity Detection Models
Have you ever wondered what kind of algorithm your Google Assistant or Siri uses to recognize and distinguish your voice? Half of the credit goes to Voice Activity Detection models, or VAD for short.
The main purpose of these models is to distinguish between audio where speech is present and audio where it is absent. This improves speech recognition accuracy and helps when preprocessing raw audio data.
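The idea of separating "speech present" from "speech absent" can be sketched with a plain energy threshold. The VAD models on Hugging Face are learned neural networks and far more robust than this; the frame size, threshold, and synthetic signal below are all assumptions chosen to keep the example small.

```python
import math


def frame_energy(frame):
    """Mean squared amplitude of one frame of audio samples."""
    return sum(sample * sample for sample in frame) / len(frame)


def detect_activity(samples, frame_size=160, threshold=0.01):
    """Return one True/False flag per full frame: True where the frame is loud."""
    flags = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        flags.append(frame_energy(frame) > threshold)
    return flags


# Synthetic signal at 16 kHz: silence, a 200 Hz tone standing in for speech, silence.
silence = [0.0] * 320
tone = [0.5 * math.sin(2 * math.pi * 200 * n / 16000) for n in range(320)]

flags = detect_activity(silence + tone + silence)
print(flags)
```

Energy alone cannot tell speech from a slamming door, which is exactly the gap trained VAD models are meant to close.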
Hugging Face gives you the building blocks for many kinds of AI projects. With the help of its model library, you can create your favorite projects or help your business run.
At face value, Hugging Face seems like a perfect idea, but when you look more closely at how it works, you might encounter some reasons for concern.
Hugging Face Isn't Flawless
Earlier this year, the cloud security firm Wiz found two serious architectural problems with Hugging Face.
The first concern was that someone could upload a malicious AI model, which could be used to gain unauthorized access to other customers' data.
The second issue that raised eyebrows was that some AI-as-a-service platforms were discovered to have vulnerable container registries. Container registries are typically used for storing and managing container images. By exploiting a vulnerable registry, attackers could tamper with other users' models and potentially insert harmful code.
Additionally:
- 100 malicious PyTorch and TensorFlow Keras models were identified on Hugging Face by JFrog.
- A PyTorch model named "baller423" was found to be exploiting Python's pickle module to execute code remotely.
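Why is pickle such a convenient vehicle for this? A pickled object can tell Python which function to call while it is being loaded. The sketch below shows the mechanism with a harmless built-in (`len`); the `Payload` class is made up for this example and has nothing to do with the actual malicious models, where the call would be something far less friendly.

```python
import pickle


class Payload:
    """An object that asks pickle to call a function when it is loaded."""

    def __reduce__(self):
        # "To rebuild me, call len('abc')". A real attack would name a
        # function that runs shell commands instead of a harmless built-in.
        return (len, ("abc",))


blob = pickle.dumps(Payload())

# Loading does not give back a Payload object. It runs the call stored
# in the pickle and returns whatever that call returned.
loaded = pickle.loads(blob)
print(loaded)  # 3
```

This is why loading a pickle-based model file from an untrusted source is equivalent to running code from that source.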
With services as popular as Hugging Face, cyberattackers will inevitably jump on the bandwagon to try to breach security and steal user data. However, the organization is trying its best to improve security and safeguard user data.
Hugging Face has implemented malware scanners and other measures to prevent these attacks. These features scan every file in the repositories for malicious code, unsafe deserialization, or sensitive information, and alert the users or the moderators accordingly.
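To give a feel for what a check for unsafe deserialization might look at, here is a rough sketch that inspects a pickle's opcodes without ever loading it. This is not Hugging Face's scanner; the opcode list and the `Payload` class are assumptions made for illustration, and a real scanner has to be smarter, because legitimate model files also use some of these opcodes.

```python
import pickle
import pickletools

# Opcodes that import names or call objects during loading.
SUSPICIOUS = {"GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX"}


def suspicious_opcodes(data: bytes):
    """List the suspicious opcode names found in a pickle, without loading it."""
    found = {op.name for op, _arg, _pos in pickletools.genops(data) if op.name in SUSPICIOUS}
    return sorted(found)


class Payload:
    """Hypothetical object that makes pickle call a function on load."""

    def __reduce__(self):
        return (len, ("abc",))  # harmless stand-in for a malicious call


plain_blob = pickle.dumps({"weights": [0.1, 0.2]})
risky_blob = pickle.dumps(Payload())

print(suspicious_opcodes(plain_blob))  # plain data: nothing flagged
print(suspicious_opcodes(risky_blob))  # includes 'REDUCE'
```

Formats that store only tensors and metadata avoid this whole class of problem, which is one reason to prefer them when a model offers the choice.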
However, the fact that attackers managed to get past some security measures in the first place tells us that we should be cautious about downloading models, and should not trust everything on Hugging Face without proper verification.
💬 What do you think about Hugging Face? Do you find the platform useful? Do share your thoughts in the comments below.
Author Info
Swayam Sai Das is a student exploring the realms of Linux as an Intern Writer at It's FOSS. He is dedicated when trying to push ranks in FPS games, and enjoys reading literature classics in an attempt to put on an academic facade.