DataComp-LM: In Search of the Next Generation of Training Sets for Language Models

We introduce DataComp for Language Models (DCLM), a testbed for controlled dataset experiments with the goal of improving language models. As part of DCLM, we provide a standardized corpus of 240T tokens extracted from Common Crawl, effective pretraining recipes based on the OpenLM framework, and a broad suite of 53 downstream evaluations. Participants in the DCLM benchmark can experiment with data curation strategies such as deduplication, filtering, and data mixing at model scales ranging from 412M to 7B parameters. As a baseline for DCLM, we conduct extensive experiments and find that…

Having trouble getting WCF-based traffic to show up in JMeter recorder

November 15, 2024

I’m running into an issue with getting the traffic from a WCF-based Windows desktop client to show up in the JMeter recorder – and I’m hoping someone out there might be able to point me in the right direction.

Complicating things a bit, I’m not working directly with the developers of the system, but rather for one of their customers – which somewhat limits my access to some of the information and expertise I would otherwise expect to have. The client runs over WCF with a binding to an HTTPS endpoint on Azure.

I have JMeter running on the same box as the client. I have the JMeter Proxy Server working and the ApacheJMeterTemporaryRootCA.crt in place and working. The client doesn’t have its own proxy settings, but does appear to be honoring useDefaultWebProxy="true" (as indicated by the client failing with SSL security warnings when the temp cert isn’t in place). However, when I exercise the client with the HTTP(S) Test Script Recorder running, none of the traffic to and from the client ever shows up in the recorder. When I use a browser under the same setup, the traffic is recorded as expected.

Being able to get the recorder working is vital because I don’t have access to enough information about the client’s protocols to try building out tests for them manually (and I suspect it may prove outside of my skillset, given my very limited experience with WCF).

Any useful solutions, suggestions, tips, or pointers would be GREATLY appreciated!

(I’m posting this question in XXXXX to maximize my chances of getting the right answer quickly. If this type of cross-posting is frowned upon, I apologize for any inconvenience.)

Best,

Justin

“If you can’t tell me what you tested, you might as well have not tested.”
