BLOOM

BLOOM is an open-source, large-scale, multilingual generative AI model that supports 46 natural languages and 13 programming languages, aiming to promote freedom and openness in AI research.

Location: France
Language: ar, bg, de, el, en, es, fr, hi, it, ja
Collection time: 2025-07-26

BLOOM: A New Era of Large-Scale Multilingual Open-Source AI Models

BLOOM is a generative large language model jointly developed by hundreds of AI researchers worldwide. It stands out for its very large parameter count, multilingual coverage, and openness, supporting 46 natural languages and 13 programming languages. Its release symbolizes freedom and openness in AI research, and its generative capabilities and broad applicability have drawn significant attention in the industry.

BLOOM's main functions

BLOOM is an autoregressive generative large language model. Built on a transformer architecture, it has 176 billion parameters and supports 46 natural languages and 13 programming languages. It was developed and trained by the BigScience Workshop on the Jean Zay supercomputer in France, with the aim of promoting a transparent, reusable, and open AI research ecosystem. Its advantages include:

  • Multilingual support: It covers English, French, Chinese, Hindi, Arabic, and other languages.
  • Powerful generation capabilities: It can generate coherent, human-like text based on user prompts.
  • Downstream task transfer: It is easy to fine-tune for NLP tasks such as summarization, question answering, translation, and information extraction.
  • Programming language compatibility: It performs well in mainstream programming languages such as Python, Java, and C++.
  • Fully open and downloadable: Anyone can get it for free from Hugging Face and deploy it.

Figure: Screenshot from BLOOM's official website

For example, BLOOM can handle the following tasks (source: function page link):

| Function type | Description |
| --- | --- |
| Text generation | Continuation, dialogue, short essay writing |
| Summarization / information extraction | Automatically generate text summaries and extract key information |
| Code completion | Code completion and generation for multiple programming languages |
| Semantic understanding | Can handle reading comprehension and question answering in some formats |
| Multilingual translation | Supports multilingual translation (not a professional MT system, but usable for demos and experiments) |

BLOOM's Data Diversity Statistics

BLOOM was trained on a highly diverse corpus, summarized in the following table:

Figure: BLOOM text generation example

| Item | Quantity |
| --- | --- |
| Natural languages | 46 |
| Programming languages | 13 |
| Preprocessed text size | 1.6 TB |
| Number of training tokens | 350 billion (350B) |
| Maximum supported sequence length | 2048 tokens |

For more model details, see the official Hugging Face documentation.
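
The 2048-token figure in the table is easiest to reason about as a shared budget for the prompt and the generated continuation. The following is a minimal sketch of that arithmetic. Both helper names are hypothetical (they are not part of any library), and treating 2048 as a hard limit is an assumption based on the table above.

```python
CONTEXT_LENGTH = 2048  # maximum sequence length quoted in the table above


def remaining_token_budget(prompt_tokens: int, context_length: int = CONTEXT_LENGTH) -> int:
    """Return how many new tokens fit after a prompt of the given length.

    Hypothetical helper: assumes prompt and generated tokens share one window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(context_length - prompt_tokens, 0)


def clamp_max_new_tokens(prompt_tokens: int, requested: int) -> int:
    """Clamp a requested generation length to what the window still allows."""
    return min(requested, remaining_token_budget(prompt_tokens))


print(remaining_token_budget(1500))      # 548
print(clamp_max_new_tokens(1500, 1000))  # 548
print(clamp_max_new_tokens(100, 100))    # 100
```

In practice the prompt length would come from the tokenizer (for example, the length of the tokenized input IDs), and the clamped value would be passed as the generation length.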

Figure: Official Q&A page

BLOOM's pricing and plans

As an open-source model, BLOOM's base model is completely free: anyone can download it from Hugging Face and deploy it locally without paying licensing fees.

BLOOM is released under the BigScience RAIL license, which allows individuals, research institutions, and community groups to use and modify it free of charge but explicitly prohibits use in scenarios that violate ethics or the law. If you use cloud-based inference services, customized APIs, or enterprise-grade deployments, Hugging Face may offer separate paid options. These are value-added services from cloud vendors and platform providers and do not conflict with the open nature of the BLOOM model itself.

| Version / service category | Price / license | Access method |
| --- | --- | --- |
| BLOOM full model | Free, RAIL open license | Official website |
| API cloud inference | Based on Hugging Face prices | API page |
| Local deployment | Free | Hardware resources must be provided by the user |

For more pricing and deployment details, see the Hugging Face pricing page.

Figure: Statistical chart of diversity in model training data

How to use BLOOM

BLOOM is designed to be "useful out of the box," supporting multi-platform and multi-framework calls. Developers can use it in the following ways:

  1. Download the weights and tokenizer directly: They can be loaded and used locally with PyTorch/Transformers.
  2. Run cloud-based inference directly via the Hugging Face API (registration and API key required).
  3. Fine-tune the model or apply transfer learning to meet specific business needs.

Quick-start example (see the official quick start guide):

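    from transformers import AutoModelForCausalLM, AutoTokenizer

    # "bigscience/bloom" is the full 176B-parameter checkpoint; smaller
    # checkpoints such as "bigscience/bloom-560m" load the same way.
    tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom")
    model = AutoModelForCausalLM.from_pretrained("bigscience/bloom")

    # Tokenize the prompt into PyTorch tensors.
    prompt = "Please briefly introduce the main functions of the BLOOM model."
    inputs = tokenizer(prompt, return_tensors="pt")

    # Generate up to 100 new tokens and decode them back to text.
    outputs = model.generate(**inputs, max_new_tokens=100)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))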

If you only need a small-scale trial, there are also interactive web demos available in Hugging Face Spaces.
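
For the hosted-inference route (option 2 in the list above), a request could be assembled roughly as follows. This is a sketch under stated assumptions, not the platform's definitive API. The endpoint pattern and the `inputs`/`parameters` payload shape follow the classic hosted Inference API and should be checked against the current Hugging Face documentation. `build_request` and `query` are hypothetical helper names.

```python
import json
import urllib.request

# Assumed endpoint pattern for the hosted Inference API; verify before use.
API_URL_TEMPLATE = "https://api-inference.huggingface.co/models/{model_id}"


def build_request(prompt: str, api_key: str, model_id: str = "bigscience/bloom",
                  max_new_tokens: int = 50) -> urllib.request.Request:
    """Build (but do not send) a text-generation request."""
    payload = {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}
    return urllib.request.Request(
        API_URL_TEMPLATE.format(model_id=model_id),
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def query(prompt: str, api_key: str) -> object:
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(prompt, api_key), timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `query("Hello", api_key)` needs a valid API key and network access. Whether a given checkpoint is served by the hosted API at any given time is up to the platform.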

Figure: Paid plans

Hardware requirements:

| BLOOM parameter scale | Recommended hardware |
| --- | --- |
| 176B-parameter full version | Multiple A100 GPUs / enterprise servers |
| Lightweight versions such as 7B1, 3B, or 1B1 | A single high-end GPU is sufficient |
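
A back-of-the-envelope calculation shows why the full model needs several GPUs. The sketch below counts only the memory needed to hold the weights. Two assumptions are not from the original text: 2 bytes per parameter (half precision) and 80 GB per GPU (one A100 variant). Activations and the generation cache add more on top.

```python
import math


def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Memory needed to hold the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


def min_gpu_count(num_params: float, gpu_memory_gb: float = 80.0,
                  bytes_per_param: int = 2) -> int:
    """Smallest number of GPUs whose combined memory holds the weights."""
    return math.ceil(weight_memory_gb(num_params, bytes_per_param) / gpu_memory_gb)


full_gb = weight_memory_gb(176e9)   # 176B parameters at 2 bytes each
small_gb = weight_memory_gb(7.1e9)  # a ~7B lightweight checkpoint

print(f"176B weights: {full_gb:.0f} GB, at least {min_gpu_count(176e9)} x 80 GB GPUs")
print(f"7B1 weights: {small_gb:.1f} GB, at least {min_gpu_count(7.1e9)} x 80 GB GPU")
```

Under these assumptions the full model needs about 352 GB for weights alone, which is at least five 80 GB GPUs, while a 7B-class checkpoint fits on one.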

Who is BLOOM suitable for?

BLOOM is positioned as "open source, cutting-edge technology", and is therefore suitable for the following groups:

  • Academic researchers, university faculty, and students: It can be used for NLP research, reproducing results, model fine-tuning, and more.
  • AI developers and engineers: Integrate it into product prototypes and validate AI capabilities.
  • Multilingual application developers: Serve multinational corporations or linguistically diverse user groups.
  • Data scientists: Use it for specific registers, domain knowledge extraction, and custom tasks.
  • Open source community contributors: Work on model optimization, evaluation, and supporting tools.
  • Programming education and automation tool developers: Experiment with AI code generation and completion.

Example applications:

| Application type | Demonstrative value |
| --- | --- |
| Multilingual document generation / summarization | Automatic synthesis of multilingual information |
| Question answering and chatbots | Build an assistant that supports multiple languages |
| Code understanding and completion | Subject-specific programming assistance |
| Cross-language content creation | Automated content for global users |
| Low-resource language research | Promoting the protection of language diversity |

Figure: Official documentation

For details on suitable users and usage recommendations, see the official documentation.

Highlights of BLOOM's Technical Architecture and Training

Model architecture features

  • It adopts a decoder-only structure, similar to GPT-3. However, it covers more languages and has better transfer and generalization capabilities.
  • It has up to 176 billion parameters and supports sequence lengths of up to 2048 tokens. The aim is to achieve a wider range of semantic understanding and generation.

| Architecture parameter | Value |
| --- | --- |
| Number of layers | 70 |
| Number of attention heads | 112 |
| Hidden dimension | 14336 |
| Vocabulary size | 250,680 |

See: More technical details
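
The architecture figures above are enough to sanity-check the 176-billion parameter count. The sketch below uses a common rule of thumb for decoder-only transformers, roughly 12·h² weights per layer plus the token embedding matrix. The rule of thumb and the 4x-wide feed-forward block it implies are assumptions here, not something the source states. Biases and layer norms are ignored.

```python
# Architecture values from the summary above.
NUM_LAYERS = 70
NUM_HEADS = 112
HIDDEN_SIZE = 14336
VOCAB_SIZE = 250_680


def estimate_parameters(num_layers: int, hidden_size: int, vocab_size: int) -> int:
    """Rough decoder-only parameter count.

    Assumes each layer has about 4*h^2 attention weights and 8*h^2 MLP
    weights (a 4x-wide feed-forward block); biases and layer norms are ignored.
    """
    per_layer = 12 * hidden_size ** 2
    embeddings = vocab_size * hidden_size
    return num_layers * per_layer + embeddings


head_dim = HIDDEN_SIZE // NUM_HEADS
total = estimate_parameters(NUM_LAYERS, HIDDEN_SIZE, VOCAB_SIZE)

print(f"head dimension: {head_dim}")                # 128
print(f"estimated parameters: {total / 1e9:.1f}B")  # 176.2B
```

The estimate lands close to the quoted 176 billion, which suggests the listed layer count, hidden dimension, and vocabulary size are mutually consistent.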

Diversity and fairness of AI training models

  • Extensive data coverage: It includes 46 natural languages, 13 programming languages, and 1.6 TB of high-quality text.
  • Strong diversity design principle: It emphasizes proportional sampling of low-resource languages and stresses "open source, openness, and inclusivity".
  • Multiple model versions: In addition to the full version with 176B parameters, lightweight versions such as 7B1 and 3B are also available for users with limited resources.
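
To illustrate what rebalancing toward low-resource languages can look like, the sketch below uses exponent-smoothed sampling, a common technique in multilingual training. The source does not say which scheme BLOOM used, so this is a stand-in for illustration only. The corpus sizes are hypothetical, and `sampling_probabilities` is not a library function.

```python
def sampling_probabilities(sizes: dict[str, float], alpha: float = 0.5) -> dict[str, float]:
    """Return per-language sampling probabilities proportional to size**alpha.

    alpha = 1.0 reproduces the raw proportions; smaller values flatten the
    distribution and give low-resource languages a larger share.
    """
    weights = {lang: size ** alpha for lang, size in sizes.items()}
    total = sum(weights.values())
    return {lang: weight / total for lang, weight in weights.items()}


# Hypothetical corpus sizes in GB, for illustration only (not BLOOM's data).
corpus_gb = {"en": 900.0, "fr": 90.0, "sw": 10.0}

raw = sampling_probabilities(corpus_gb, alpha=1.0)
smoothed = sampling_probabilities(corpus_gb, alpha=0.5)

for lang in corpus_gb:
    print(f"{lang}: raw {raw[lang]:.3f} -> smoothed {smoothed[lang]:.3f}")
```

With `alpha` below 1, the smallest language is sampled more often than its raw share of the corpus, at the expense of the largest one.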

BLOOM's Risks, Limitations, and Usage Recommendations

Limitations and risks must be faced squarely:

  • Not a tool for high-stakes decisions: The model's output may seem reliable, but its accuracy needs to be verified, and it is not suitable for direct decision-making in areas such as biomedicine, finance, and law.
  • May output harmful content: For example, biased or aggressive text, or sensitive words.
  • Strict ethical and data compliance is required: Comply with the RAIL license and do not misuse the model.

| Main risk type | Explanation |
| --- | --- |
| Bias / data imbalance | Information about certain groups appears at different frequencies |
| Personal information leakage | The training data may contain sensitive content |
| Misinformation | Generated content is not guaranteed to be factual |
| Inappropriate use | Automatic evaluation of individuals and use in key decision-making scenarios are prohibited |

Figure: Risk and Limitations Statement document

For details, see the Risk and Limitations Statement document.

BLOOM Frequently Asked Questions

What are the different versions of the BLOOM model, and how do you choose between them?

BLOOM is available at multiple parameter scales, from small (bloom-560m) to very large (the full 176B-parameter model).

  • With limited hardware resources, we recommend choosing lightweight versions such as 7B1 or 3B.
  • For research and high-performance requirements, the full 176B version can be used, but it requires distributed multi-GPU deployment.

For a detailed version list, see the BLOOM model list.
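
The choice between versions can be framed as picking the largest checkpoint whose weights fit the available GPU. The sketch below is one way to express that. The parameter counts are approximations read off the checkpoint names, and the 70% headroom rule and 2 bytes per parameter are assumptions for illustration. `pick_checkpoint` is a hypothetical helper.

```python
from typing import Optional

# Approximate parameter counts implied by the checkpoint names on the Hub.
CHECKPOINTS = {
    "bigscience/bloom-560m": 0.56e9,
    "bigscience/bloom-1b1": 1.1e9,
    "bigscience/bloom-1b7": 1.7e9,
    "bigscience/bloom-3b": 3.0e9,
    "bigscience/bloom-7b1": 7.1e9,
    "bigscience/bloom": 176e9,
}


def pick_checkpoint(gpu_memory_gb: float, bytes_per_param: int = 2,
                    headroom: float = 0.7) -> Optional[str]:
    """Return the largest checkpoint whose weights fit in the usable memory.

    Hypothetical rule of thumb: only `headroom` of the GPU memory is budgeted
    for weights, leaving the rest for activations and the generation cache.
    """
    usable_bytes = gpu_memory_gb * 1e9 * headroom
    fitting = [
        (params, name)
        for name, params in CHECKPOINTS.items()
        if params * bytes_per_param <= usable_bytes
    ]
    return max(fitting)[1] if fitting else None


for memory in (8, 24, 80):
    print(f"{memory} GB GPU -> {pick_checkpoint(memory)}")
```

Under these assumptions a single 24 GB or 80 GB card tops out at the 7B1 checkpoint, which matches the advice above that the full model needs a multi-GPU setup.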

Can BLOOM be used in commercial products?

Under the RAIL license, BLOOM can generally be used in commercial applications, provided the use is not illegal and does not fall under high-risk or prohibited scenarios. Read the license carefully to make sure you do not violate any additional terms. Commercial calls to cloud APIs incur additional fees under the Hugging Face platform's terms.

Can BLOOM be customized and fine-tuned? Does it work with custom data?

BLOOM is designed to be transferable and fine-tunable, and the development team and community have provided various practical fine-tuning solutions. Using the publicly available Transformers toolkit, developers can quickly adapt BLOOM to their own datasets for downstream tasks such as classification, annotation, and generation.

Fine-tuning tutorials and practical guides: see the official documentation and community posts.

Conclusion

BLOOM has become a milestone in promoting the democratization of NLP and open collaboration in AI. Its multilingual capabilities and open ecosystem give developers and AI enthusiasts around the world fertile ground for innovation. Whether for scientific research, language diversity protection, or intelligent product prototyping, BLOOM offers a flexible, open, and high-performance option. If you want to try cutting-edge AI, visit the BLOOM official documentation to start exploring and help advance AI technology.


