OpenAI releases 1-million-token coding model GPT-4.1, available immediately via API
The post OpenAI releases 1 million token coding model GPT 4.1, available immediately via API appeared on BitcoinEthereumNews.com.
OpenAI has released GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano to its API suite, phasing out GPT-4.5 Preview while advancing code generation, instruction following, and long-context processing. Effectively signaling the failure of GPT-4.5, the new 4.1 models introduce context windows of up to one million tokens, enabling native handling of full repositories, extensive documents, and complex multi-turn agent workflows within a single call.

While researching this article, I was able to use GPT-4.1 to 'vibe code' a simple Python-based dungeon crawler in 5 minutes and 5 prompts. The model made no errors in its code; the only issues related to identifying the relevant sprites in the asset atlas I imported.

Dungeon crawler demo built with GPT-4.1

Thanks to its large context window, it was also able to identify the functionality of a large code repository within a few prompts.

Model Capabilities and Transition Path

Per OpenAI, GPT-4.1 scores 54.6% on SWE-bench Verified, reflecting an improved ability to produce runnable code patches that resolve real-world repository issues. This outpaces GPT-4o's 33.2% and GPT-4.5's 38% on the same benchmark. The model also executes code diffs more precisely, scoring 53% on Aider's polyglot benchmark in diff format, more than double GPT-4o's 18%.

Instruction-following fidelity is also refined. On Scale's MultiChallenge, GPT-4.1 reaches 38.3% accuracy, compared with 27.8% for GPT-4o. These gains cover adhering to strict output formats, complying with constraints, and following nested or contradictory instructions. According to the AI coding platform Windsurf, internal evaluations show that GPT-4.1 produces cleaner diffs and aligns more closely with structured developer workflows.

Long-context support now extends to 1 million tokens, surpassing the previous 128K-token window.
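The diff-format numbers matter because diff-style editing is unforgiving: the model must reproduce existing source text exactly for a patch to apply at all. A toy sketch of that constraint, using a simplified search/replace edit loosely inspired by (but not identical to) the formats coding tools like Aider use:

```python
# Toy illustration of why diff-style edits demand precision: the search
# text must match the existing source verbatim, so a model that
# hallucinates even one character produces a patch that fails to apply.
# This helper is a hypothetical simplification, not any tool's real format.

def apply_search_replace(source: str, search: str, replace: str) -> str:
    """Apply one edit; raise if the search text is not found verbatim."""
    if search not in source:
        raise ValueError("patch failed: search block not found in source")
    return source.replace(search, replace, 1)

original = "def greet(name):\n    print('Hello ' + name)\n"

# An exact match applies cleanly.
patched = apply_search_replace(
    original,
    search="print('Hello ' + name)",
    replace="print(f'Hello {name}')",
)
print(patched)
```

A whole-file rewrite hides small transcription errors; a failed search/replace surfaces them immediately, which is why diff-format accuracy is a stricter measure of code fidelity.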
To validate this, OpenAI released MRCR, an open-source evaluation that tests a model’s ability to retrieve specific details from within dense, distractor-heavy context…
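The general idea behind distractor-heavy retrieval evals can be sketched as follows. This is a minimal, hypothetical harness in the spirit of needle-in-a-haystack tests, not OpenAI's actual MRCR methodology: it buries several near-identical "needles" in filler text and grades an answer as correct only if it reproduces the requested needle and none of the look-alikes.

```python
import random

# Hypothetical sketch of a distractor-heavy retrieval eval: interleave
# numbered near-identical needles with filler, then grade whether an
# answer recovers exactly the requested one. Illustrative only.

def build_context(needles: list[str], filler: str, seed: int = 0) -> str:
    """Interleave numbered needles with random-length runs of filler."""
    rng = random.Random(seed)
    parts = []
    for i, needle in enumerate(needles, start=1):
        parts.append(filler * rng.randint(2, 5))
        parts.append(f"[note {i}] {needle}")
    parts.append(filler)
    return "\n".join(parts)

def grade(answer: str, needles: list[str], target_index: int) -> bool:
    """Correct only if the answer contains the target needle and no other."""
    target = needles[target_index - 1]
    others = [n for i, n in enumerate(needles, start=1) if i != target_index]
    return target in answer and not any(n in answer for n in others)

needles = [f"the code word for round {i} is alpha-{i}" for i in range(1, 4)]
context = build_context(needles, filler="(unrelated log line)\n")

# A model would be prompted with `context` plus an instruction such as
# "repeat note 2 exactly"; here we just grade two hypothetical answers.
print(grade("the code word for round 2 is alpha-2", needles, 2))  # True
print(grade("it was alpha-1, or maybe alpha-2", needles, 2))      # False
```

The distractors are what make this hard at million-token scale: retrieving *a* code word is easy, while retrieving the specific one requested tests whether the model actually tracks position and reference across the full context.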
Filed under: News - @ April 15, 2025 9:28 am