Markov AI released Computer Use Large on Hugging Face with 48,478 screen recordings spanning about 12,300 hours across six professional apps. The CC-BY-4.0 corpus is aimed at training and evaluating GUI agents on real software workflows.

Computer Use Large is a new Hugging Face dataset for desktop-agent work, built from 48,478 screen recordings of professional software use and released under CC-BY-4.0, according to the launch post. The Hugging Face dataset page positions it for “training & evaluating computer use agents,” not just passive video understanding. That matters because the source material is grounded in GUI workflows rather than synthetic trajectories.
The current coverage spans six applications: AutoCAD, Blender, Excel, Photoshop, Salesforce, and VS Code. That gives the corpus a mix of office, creative, CAD, CRM, and coding environments, which is broader than single-app desktop datasets and more relevant for benchmarking cross-domain computer-use behavior.
The dataset page says the videos were sourced from YouTube tutorials and then processed to keep only screen-centric segments: audio was stripped with ffmpeg, intros and outros were removed, frames were sampled every 10 seconds, and Gemini Flash, a vision-language model, classified whether each sampled frame showed genuine screen-recording content. Videos with less than 10 seconds of screen activity were discarded. The metadata includes fields such as original and trimmed duration, upload date, screen-content percentage, and segment counts.
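The processing steps described above can be sketched roughly as follows. This is not Markov AI's pipeline, only an illustration under stated assumptions: the two helpers build standard ffmpeg argument lists (`-an` to drop audio, the `fps` filter to sample one frame every 10 seconds), and `summarize` is a hypothetical function that turns per-frame classifier verdicts into segments, a screen-content percentage, and a keep/discard decision. The classifier call itself is left out, and each sampled frame is assumed to stand for one full 10-second interval.

```python
from dataclasses import dataclass

SAMPLE_INTERVAL_S = 10  # dataset page: one frame sampled every 10 seconds
MIN_SCREEN_S = 10       # dataset page: videos with less screen activity are dropped


def strip_audio_cmd(src: str, dst: str) -> list[str]:
    """ffmpeg arguments that drop the audio track and copy the video stream."""
    return ["ffmpeg", "-i", src, "-an", "-c:v", "copy", dst]


def sample_frames_cmd(src: str, out_pattern: str) -> list[str]:
    """ffmpeg arguments that write one frame per SAMPLE_INTERVAL_S seconds."""
    return ["ffmpeg", "-i", src, "-vf", f"fps=1/{SAMPLE_INTERVAL_S}", out_pattern]


@dataclass
class ScreenSummary:
    segments: list[tuple[int, int]]  # (start_s, end_s) runs of screen content
    screen_seconds: int
    screen_pct: float
    keep: bool


def summarize(labels: list[bool]) -> ScreenSummary:
    """Merge consecutive screen-labelled samples into segments.

    labels[i] is the (hypothetical) classifier verdict for the frame
    sampled at i * SAMPLE_INTERVAL_S seconds.
    """
    segments = []
    start = None
    for i, is_screen in enumerate(labels):
        if is_screen and start is None:
            start = i  # a new screen-content run begins
        elif not is_screen and start is not None:
            segments.append((start * SAMPLE_INTERVAL_S, i * SAMPLE_INTERVAL_S))
            start = None
    if start is not None:  # close a run that reaches the end of the video
        segments.append((start * SAMPLE_INTERVAL_S, len(labels) * SAMPLE_INTERVAL_S))

    screen_seconds = sum(end - begin for begin, end in segments)
    total = len(labels) * SAMPLE_INTERVAL_S
    pct = 100.0 * screen_seconds / total if total else 0.0
    return ScreenSummary(segments, screen_seconds, pct, screen_seconds >= MIN_SCREEN_S)
```

For example, `summarize([False, True, True, False, True])` treats the first and fourth samples as non-screen content (say, an intro and a talking-head cutaway) and reports two segments covering 30 of 50 seconds.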
For engineers, the practical value is less about a new model release and more about data availability. A large, openly licensed corpus with per-video metadata and category splits can support pretraining, eval set construction, and comparisons across app domains, all loadable through Hugging Face’s load_dataset API. A supporting repost reinforces the scale claim, calling it the “world’s largest open-source dataset of computer-use recordings” and highlighting 10,000-plus hours across enterprise and productivity software.
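As one illustration of eval set construction from per-video metadata, the sketch below holds out a fixed number of high-screen-content videos per application. The record keys `video_id`, `app`, and `screen_pct` are hypothetical stand-ins for the dataset's real field names, and the 80% threshold is an arbitrary choice. In practice the records would come from iterating a split returned by `load_dataset`; the repository id is not given here, so that step is omitted.

```python
import random
from collections import defaultdict


def balanced_eval_split(records, per_app, min_screen_pct=80.0, seed=0):
    """Pick up to per_app held-out videos per application.

    records: iterable of dicts with hypothetical keys "video_id", "app",
    and "screen_pct". Returns (eval_ids, train_ids) as sorted lists.
    """
    by_app = defaultdict(list)
    for rec in records:
        by_app[rec["app"]].append(rec)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    eval_ids, train_ids = [], []
    for app in sorted(by_app):
        # Only videos that are mostly screen content are eligible for eval.
        candidates = [r for r in by_app[app] if r["screen_pct"] >= min_screen_pct]
        rng.shuffle(candidates)
        chosen = {r["video_id"] for r in candidates[:per_app]}
        for r in by_app[app]:
            target = eval_ids if r["video_id"] in chosen else train_ids
            target.append(r["video_id"])
    return sorted(eval_ids), sorted(train_ids)
```

Sampling the same number of videos from each app keeps cross-domain comparisons from being dominated by whichever application has the most recordings.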
World's largest open-source dataset of computer-use recordings just dropped on Huggingface, for training & evaluating computer use agents. 48,478 screen recording videos (~12,300 hours) of professional software being used. License - CC-BY-4.0