Background

After migrating my blog, I wasn’t entirely satisfied with Copilot Pro, especially for debugging, handling details, and advancing complex tasks, which often forced me to repeatedly supply extra context. So I subscribed to Codex, wanting to try it in scenarios closer to real work: code generation, debugging, and project modifications, as well as daily tasks like documentation, PPTs, and email summaries.

Installing Codex is relatively simple. After you download the installer from the official website, the rest of the installation redirects to the Microsoft Store. The process itself is not difficult, but the experience is a bit roundabout: the entry point is the official website, while the actual installation and updates rely on the Microsoft Store. Perhaps this is meant to comply with Windows security mechanisms while also saving the vendor CDN download costs and simplifying deployment, but it also makes updates less smooth. Many Windows app installations follow this trend now.

As for version updates, Codex currently iterates quickly, but the Microsoft Store does not always keep up. Sometimes an official version has been released, but the client does not update automatically in time, so some new features are unavailable locally, or the installed version is too old to work properly. In such cases I have to check for updates manually, which disrupts the workflow.

For example, with the recently launched Codex Chrome extension, I ran into a situation where the official release had been out for over ten hours, yet the Microsoft Store still had not updated the downloadable Codex client to a version that could use it. By contrast, the Codex extension in the Chrome Web Store updates faster and can be installed and used as soon as it is listed.

However, the Codex Chrome extension was not stable enough when it first shipped. It needs to establish a connection with the local Codex client, and an unstable connection directly affects usability. For instance, when I tried to read and summarize Gmail messages through the Codex Chrome extension, I ran into connection failures and could not retrieve the email content. If Gmail is the main need, I prefer to use the Codex Gmail extension directly: it reads content through a linked email account, which is more stable than relying on the browser extension chain.

Experience Comparison

In terms of code generation, Codex feels smoother than Copilot Pro. By “smoother,” I don’t mean it always gives a perfect answer on the first try, but it acts more like an assistant that can continuously advance tasks: it understands longer contexts and makes it easier to break down, modify, and verify tasks.

Copilot Pro feels more like a code assistant embedded in the editor. It is valuable for autocompletion, local generation, and simple debugging, but when faced with complex requirements, it often requires me to repeatedly explain the background, correct its direction, and add constraints. Often it is not that it cannot do the task, but that the cost of pushing the task forward is high.

Codex’s advantages mainly come from model capabilities and task forms. It can directly use the newer ChatGPT models, performing more stably in understanding complex requirements, generating code, modifying projects, and explaining issues. Previously, with Copilot Pro, some requirements required multiple rounds to get close to expectations; after switching to Codex, although iteration is still needed, ineffective communication has significantly decreased.

Another major difference is the task boundary. Copilot Pro is more focused on code generation and debugging, while Codex is more like a complete AI workbench. Besides writing code, it can handle tasks like Markdown, PPTs, translation, data analysis, and automation workflows. In some scenarios, it can even replace some automation tools without needing to separately handle API model selection, subscription configuration, and security access.

Of course, this doesn’t mean Codex can generate final results in one shot. Whether it’s code functionality, page layout, document structure, or security, performance, and architecture considerations, the generated output still needs iteration. The difference is that iterating with Codex is more efficient, and the modification path is more coherent. After using it for a while, I’ve started considering canceling my Copilot Pro subscription—it’s hard to go back. In my experience, Codex also clearly outperforms OpenCode.

Codex is not without flaws. In certain specific technical areas, the code it generates may still be inaccurate, requiring manual verification and adjustment; the subscription fee is also not low, which is a practical factor for many. However, if it can truly reduce trial-and-error costs and shorten the distance from idea to result, then this money is not just for buying a tool, but for buying a new way of working.

From Ambiguity to Executability

In my experience with Codex, what impressed me most is not “how much code it can write,” but its ability to handle vague requirements.

For example, I wanted to create a PPT about AI infrastructure. If I directly said “Help me generate a PPT on AI infra,” Codex would usually first provide an outline or even generate a batch of page content. But this result is likely just “usable as a starting point” and may not fit the actual presentation scenario: who the audience is, what the presentation goal is, how deep the content should be, whether the visual style is technical or management-oriented, which terms must appear, and which content cannot be expanded—all these affect the final quality.

If I first clarify the requirements and constraints in planning mode, the effect is significantly better. For example, first specify the PPT’s audience, presentation goal, content boundaries, expression style, visual direction, and expected number of pages, then let Codex design the structure based on these constraints. The subsequent generated content will be closer to a usable draft.
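As a concrete illustration of this constraint-first approach, here is a small sketch of how I think about assembling such a prompt. The field names and values are my own invented examples, not anything from Codex itself:

```python
# A minimal sketch of a constraint-first prompt for the PPT task.
# All field names and values are illustrative, not a Codex API.

spec = {
    "audience": "infrastructure engineers and their managers",
    "goal": "explain the AI infra roadmap and get buy-in for next quarter",
    "scope": "training clusters and inference serving only",
    "style": "technical but management-friendly; one idea per slide",
    "visual": "clean diagrams rather than dense bullet lists",
    "pages": 12,
}

def build_prompt(spec: dict) -> str:
    """Turn the spec into a single planning-mode prompt."""
    lines = [
        "Design the structure of a PPT on AI infrastructure.",
        "Respect every constraint below before generating any slide content:",
    ]
    lines += [f"- {key}: {value}" for key, value in spec.items()]
    return "\n".join(lines)

print(build_prompt(spec))
```

The point is not the code itself but the habit it encodes: every constraint is stated up front, so the model designs the structure inside those boundaries instead of guessing them afterward.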

This is where I find planning mode very valuable. It doesn’t solve the problem of “completely not knowing what you want,” but rather the problem of “direction is clear, but details haven’t been fleshed out.” Many things are difficult because they start from scratch: you know you want a result, but you don’t know how to break down the first step, what structure to adopt, or what details will arise along the way.

Codex provides a good starting point here. It can first give a framework, then accompany you in gradually refining it. This applies to PPTs as well as code features. Many details of a feature are not thought of before actually starting; they only emerge during implementation. Codex can quickly generate test case frameworks, task orchestration YAMLs, initial pages, or script templates, and what I need to do is judge, make trade-offs, correct, and advance based on this foundation.
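To make the "test case framework" idea concrete, here is the kind of skeleton I would ask it to rough out first and then refine by hand. The function under test, `slugify`, is a hypothetical example of mine, not something from the article:

```python
# A sketch of a generated test skeleton: one test per behavior,
# filled in and corrected during implementation.
# `slugify` is a hypothetical function used only for illustration.
import re

def slugify(title: str) -> str:
    """Lowercase a title and join its words with hyphens."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)

def test_basic_title():
    assert slugify("Hello World") == "hello-world"

def test_punctuation_is_dropped():
    assert slugify("Codex: First Impressions!") == "codex-first-impressions"

def test_empty_input():
    assert slugify("") == ""

if __name__ == "__main__":
    test_basic_title()
    test_punctuation_is_dropped()
    test_empty_input()
    print("all tests passed")
```

A skeleton like this is rarely complete on the first pass; its value is that the cases surface early, so the judging and trade-offs happen against something concrete.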

Compared to writing entirely from scratch, this approach greatly reduces the startup resistance and also reduces much of the uncertainty in preliminary research.

A Little Observation

The biggest change Codex has brought me is that it redefines the cost of “starting something.”

In the past, many things were not impossible, but the startup cost was too high: looking up information, setting up the environment, writing templates, trying frameworks, filling in details. These steps still exist, but a large part of them can now be completed much faster. The human role has shifted from pure execution toward setting goals, judging direction, controlling quality, and finishing the last mile. Time spent on standardized work drops significantly, while more energy goes into analysis and decision-making, and overall efficiency improves qualitatively. What AI tools truly change is not just which tasks they can complete for us, but that they lower the cost of turning our ideas into results.

This way of working will continue to blur the boundaries between many different roles. It makes it easier for people to cross into unfamiliar fields and attempt things they previously dared not try. For example, full-stack development, automated analysis, lightweight design, and content production—tasks that were originally distributed across different roles—are now more easily strung together by one person.

Future job positions may become increasingly goal-oriented rather than strictly divided by fixed functions. What a person is responsible for may no longer be just a small piece of duty under a specific title, but a result that can be delivered, verified, and iterated upon.

This is already happening. What we need to adapt to is not just a single tool, but this new way of collaboration.