Physical Address
304 North Cardinal St.
Dorchester Center, MA 02124
Microsoft has introduced a new class of highly efficient AI models that process text, images and speech simultaneously while requiring considerably less computing power than existing systems. The new Phi-4 models, released today, mark a breakthrough in the development of small language models (SLMs) that deliver capabilities previously reserved for much larger AI systems.
Phi-4-multimodal, a model with only 5.6 billion parameters, and Phi-4-mini, with 3.8 billion parameters, outperform similarly sized competitors and match or even exceed the performance of models twice their size on certain tasks, according to the technical report.
“These models are designed to empower developers with advanced AI capabilities,” said Weizhu Chen, Vice President of Generative AI at Microsoft. “With its ability to process speech, vision and text simultaneously, Phi-4-multimodal opens up new possibilities for creating innovative and context-aware applications.”
The technical achievement comes at a time when companies are increasingly looking for AI models that can run on standard hardware or at the “edge,” directly on devices rather than in cloud data centers, to reduce costs and latency while preserving data privacy.
What sets Phi-4-multimodal apart is its novel “mixture of LoRAs” technique, which lets it process text, image and speech inputs within a single model.
“By leveraging the mixture of LoRAs, Phi-4-multimodal extends multimodal capabilities while minimizing interference between modalities,” the research paper states. “This approach enables seamless integration and ensures consistent performance across tasks involving text, images and speech/audio.”
The innovation allows the model to keep its strong language capabilities while adding vision and speech recognition, without the performance degradation that often occurs when models are adapted for multiple input types.
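To make the idea concrete, the following is a minimal sketch of modality-routed LoRA adapters on a single frozen linear layer. It assumes a simplified design in which the base weight stays untouched and each modality adds its own low-rank update; the class name, adapter shapes and toy values are hypothetical and are not taken from Microsoft's code or paper.

```python
# Minimal sketch of modality-routed LoRA adapters on one frozen linear layer.
# All names and sizes here are hypothetical; this is not Microsoft's code.

def matvec(matrix, vec):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


class MixtureOfLoRALinear:
    def __init__(self, base_weight, adapters, scale=1.0):
        self.base_weight = base_weight   # frozen, shared by every modality
        self.adapters = adapters         # modality -> (A, B) low-rank pair
        self.scale = scale

    def forward(self, x, modality="text"):
        y = matvec(self.base_weight, x)  # base path, never modified
        if modality in self.adapters:
            a, b = self.adapters[modality]   # A: rank x d_in, B: d_out x rank
            delta = matvec(b, matvec(a, x))  # low-rank update B(Ax)
            y = [yi + self.scale * di for yi, di in zip(y, delta)]
        return y


base = [[1.0, 0.0], [0.0, 1.0]]  # 2x2 identity as a toy base weight
adapters = {
    "vision": ([[1.0, 1.0]], [[0.5], [0.0]]),   # rank-1 adapter
    "speech": ([[1.0, -1.0]], [[0.0], [0.5]]),  # rank-1 adapter
}
layer = MixtureOfLoRALinear(base, adapters)

text_out = layer.forward([2.0, 4.0], modality="text")      # base path only
vision_out = layer.forward([2.0, 4.0], modality="vision")  # base + vision adapter
speech_out = layer.forward([2.0, 4.0], modality="speech")  # base + speech adapter
```

Because text inputs pass through the base weights alone in this sketch, adding or retraining a vision or speech adapter cannot change the text output, which is one plausible reading of how interference between modalities is kept low.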
The model has claimed the top position on the Hugging Face OpenASR leaderboard with a word error rate of 6.14%, outperforming specialized speech recognition systems such as Whisper. It also shows competitive performance on vision tasks such as mathematical and scientific reasoning with images.
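For readers unfamiliar with the metric, word error rate is conventionally computed as the word-level edit distance between a reference transcript and the system's output, divided by the number of reference words. The sketch below illustrates that standard definition; the function name and example sentences are illustrative, and the leaderboard's own text normalization steps are not reproduced here.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,           # deletion
                            curr[j - 1] + 1,       # insertion
                            prev[j - 1] + cost))   # substitution or match
        prev = curr
    return prev[-1] / len(ref)


# One substituted word ("sit") and one dropped word ("the") in six reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
perfect = word_error_rate("hello world", "hello world")
```

A rate of 6.14% therefore corresponds to roughly six word-level errors per hundred reference words.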
Despite its compact size, Phi-4-mini shows exceptional capabilities in text-based tasks. Microsoft reports that the model outperforms similarly sized models and is on par with models twice its size across various language-understanding benchmarks.
The model's performance on math and coding tasks is particularly notable. According to the research paper, “Phi-4-mini consists of 32 transformer layers with a hidden state size of 3,072” and incorporates group query attention to optimize memory usage for long contexts.
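The memory benefit of group query attention can be illustrated with a rough key/value cache estimate. The sketch below uses the 32 layers and hidden size of 3,072 quoted above; the head counts (24 query heads, 8 key/value heads), the 16-bit cache precision and the 8,192-token context are assumptions chosen for illustration and are not stated in the article.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Approximate size of the key/value cache for one sequence."""
    # The factor of 2 covers keys and values, stored per layer and per KV head.
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_value


LAYERS = 32          # from the research paper quote
HIDDEN_SIZE = 3072   # from the research paper quote
QUERY_HEADS = 24     # assumption for illustration
KV_HEADS = 8         # assumption for illustration (3 query heads share each KV head)
SEQ_LEN = 8192       # assumption: example context length

head_dim = HIDDEN_SIZE // QUERY_HEADS  # 128 under the assumed head count

# Standard multi-head attention keeps one KV head per query head.
mha_bytes = kv_cache_bytes(LAYERS, QUERY_HEADS, head_dim, SEQ_LEN)
# Group query attention keeps only the shared KV heads.
gqa_bytes = kv_cache_bytes(LAYERS, KV_HEADS, head_dim, SEQ_LEN)

print(f"MHA cache: {mha_bytes / 2**30:.1f} GiB")
print(f"GQA cache: {gqa_bytes / 2**30:.1f} GiB")
```

Under these assumed numbers the cache shrinks by the ratio of query heads to key/value heads, which is why the technique matters most for long contexts on memory-constrained devices.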
On the GSM-8K math benchmark, Phi-4-mini achieved a score of 88.6%, outperforming most 8-billion-parameter models, while reaching 64% on the MATH benchmark, much higher than similarly sized competitors.
“For the Math benchmark, the model outperforms similar-sized models with large margins, sometimes more than 20 points. It even outperforms two times larger models' scores,” the technical report states.
Capacity, an AI answer engine that helps companies unify diverse datasets, has already used the Phi family to improve the efficiency and accuracy of its platform.
Steve Frederickson, product manager at Capacity, said in a statement: “From our first experiments, what truly impressed us was the remarkable accuracy and ease of deployment, even before customization. Since then, we have been able to improve both accuracy and reliability while maintaining the cost efficiency and scalability we valued from the start.”
Capacity reported cost savings of 4.2 times compared with competing workflows while achieving the same or better qualitative results for preprocessing tasks.
For years, AI development has been driven by a single philosophy: bigger is better. More parameters, larger models, greater computational demands. But Microsoft's Phi-4 models challenge that assumption, showing that power is not just about scale but about efficiency.
Phi-4-multimodal and Phi-4-mini are designed not for the data centers of tech giants but for the real world, where computing power is limited, privacy concerns are paramount and AI has to work seamlessly without a constant connection to the cloud. These models are small, but they carry weight. Phi-4-multimodal integrates speech, vision and text processing into a single system without sacrificing accuracy, while Phi-4-mini delivers math, coding and reasoning performance on par with models twice its size.
It's not just about making AI more efficient. It's about making it more accessible. Microsoft has positioned Phi-4 for widespread adoption, making it available through Azure AI Foundry, Hugging Face and the NVIDIA API catalog. The goal is clear: AI that is not locked behind expensive hardware or massive infrastructure, but that can run on standard devices, at the edge of networks and in industries where computing power is scarce.
Masaya Nishimaki, a director at the Japanese AI company Headwaters Co., Ltd., sees the effects firsthand. “Edge AI shows outstanding performance even in environments with unstable network connections or where confidentiality is paramount,” he said in a statement. That means AI that can work in factories, hospitals and autonomous vehicles, places where real-time intelligence is required but where conventional cloud-based models fall short.
In essence, Phi-4 represents a shift in thinking. AI is not just a tool for those with the biggest servers and the deepest pockets. It is a capability that, when well designed, can work anywhere, for anyone. The most revolutionary thing about Phi-4 is not what it can do; it is where it can do it.