Is Windows your favorite development environment? Do you want to run Apache Kafka® on Windows? Thanks to the Windows Subsystem for Linux 2 (WSL 2), now you can, and with fewer tears than in the past. Windows still isn’t the recommended platform for running Kafka with production workloads, but for trying out Kafka, it works just fine.
Let’s take a look at how it’s done:
The Windows Subsystem for Linux (WSL 2) makes it all possible. Microsoft describes WSL as “a GNU/Linux environment—including most command line tools, utilities, and applications—directly on Windows, unmodified, without the overhead of a traditional virtual machine or dual boot setup.”
If you already have WSL 2 installed, skip to Install Java.
Make sure you’re running Windows 10 version 21H1 or later, or Windows 11 21H2 or later. To check your version, navigate to Settings > System > About. In the Windows specifications section, find Version. Ensure that you have all updates to your version installed.
If you’re on the Windows Update train, you probably have the latest version and are good to go. If not, you need to update Windows.
When you’re sure that Windows is up to date, follow these instructions to install WSL 2.
To install the WSL 2 feature, open PowerShell as an administrator, and run the following command:
This may take a minute or two. Your output should resemble the following:
This command installs the Microsoft Store version of WSL and automatically selects the WSL 2 version. Also, it installs the default Linux distro, which is Ubuntu 22.04 as of this writing.
Tip: If you already have the non-Store version of WSL installed, you can run the wsl --update command to get the Store version.
Reboot your machine.
After the reboot completes, log in. The installation of the default Linux distro continues automatically. The shell terminal opens and displays the following message:
Enter a username and password to complete the installation. Save them in a secure location, because you will need them when you work with the shell later.
WSL 2 is ready to use. For more information on installing WSL 2, including troubleshooting, see Install Linux on Windows with WSL. For more information on WSL commands, see Basic commands for WSL.
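To confirm from inside the Ubuntu shell that the distro is running under WSL 2, one option is to inspect the kernel release string. The sketch below is a heuristic, not an official check: it assumes the stock Microsoft kernel, whose release string contains `microsoft-standard-WSL2`, and `is_wsl2_release` is a hypothetical helper name.

```shell
# Heuristic: stock WSL 2 kernels report a release such as
# "5.15.90.1-microsoft-standard-WSL2"; WSL 1 reports e.g. "4.4.0-19041-Microsoft".
# A custom-built kernel may not match this pattern.
is_wsl2_release() {
  case "$1" in
    *microsoft-standard-WSL2*) return 0 ;;
    *) return 1 ;;
  esac
}

if is_wsl2_release "$(uname -r)"; then
  echo "Running under WSL 2"
else
  echo "Not a stock WSL 2 kernel"
fi
```

From PowerShell, `wsl --list --verbose` shows the WSL version of each installed distro as well.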
Kafka is built with Java and requires the Java runtime to execute. You can use the apt-get package manager to install the latest updates. In the Ubuntu shell window that opened above, run the following commands to install the latest versions of various libraries:
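A minimal sketch of the update step, assuming the default Ubuntu distro and a user with `sudo` rights. `update_packages` is a hypothetical wrapper name; the two `apt-get` subcommands are the standard ones for refreshing the package index and upgrading installed packages.

```shell
# Refresh the package index, then upgrade installed packages.
# -y answers "yes" to the upgrade prompt so the step runs unattended.
update_packages() {
  sudo apt-get update && sudo apt-get upgrade -y
}
```

Call `update_packages` in the Ubuntu shell; `sudo` prompts for the password you chose during the distro setup.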
Tip: Right-click pastes text into the terminal window.
It may take a few minutes to download and install all of the most recent binaries. Once Ubuntu is updated, you can install Java.
Kafka requires the Java runtime version to be 8, 11, or 17. Java 8 is deprecated, so Java 11 and Java 17 are preferred. Check the Java version in your Linux installation:
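The check can be scripted as well. The sketch below is one way to do it, assuming `java -version` prints a first line of the form `openjdk version "17.0.8" ...` (or `"1.8.0_382"` for Java 8); `java_major_version` and `is_supported_java` are hypothetical helper names.

```shell
# Extract the major version from the first line of `java -version` output.
# Handles both the legacy "1.8.0_x" scheme and the modern "11.0.x" / "17.0.x" scheme.
java_major_version() {
  ver="$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')"
  case "$ver" in
    "") return 1 ;;
    1.*) ver="${ver#1.}"; printf '%s\n' "${ver%%.*}" ;;
    *) printf '%s\n' "${ver%%[.+-]*}" ;;
  esac
}

# The supported list (8, 11, 17) comes from the text above.
is_supported_java() {
  case "$1" in
    8|11|17) return 0 ;;
    *) return 1 ;;
  esac
}

if command -v java >/dev/null 2>&1; then
  # `java -version` writes to stderr, so redirect it before parsing.
  major="$(java_major_version "$(java -version 2>&1 | head -n 1)")" || major="unknown"
  if is_supported_java "$major"; then
    echo "Java $major is supported"
  else
    echo "Java $major is not in the supported list (8, 11, 17)"
  fi
else
  echo "Java is not installed"
fi
```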
Your output should resemble this:
If Java isn’t installed (likely) or it’s not the right version, install it by using your distribution’s package manager. There are a lot of ways to install Java. On Ubuntu, this is one of the simplest:
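As a sketch of the package-manager route, assuming the `openjdk-11-jdk` and `openjdk-17-jdk` packages from the Ubuntu repositories. `install_openjdk` is a hypothetical helper name, and restricting it to 11 and 17 simply mirrors the versions the text prefers.

```shell
# Install an OpenJDK package from the Ubuntu repositories.
# $1: major version, limited here to 11 or 17.
install_openjdk() {
  case "$1" in
    11|17) sudo apt-get install -y "openjdk-$1-jdk" ;;
    *) echo "unsupported Java version: $1" >&2; return 1 ;;
  esac
}
```

For example, `install_openjdk 17` installs Java 17; run `java -version` afterward to confirm.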
You can install Kafka by using a package manager, or you can download the tarball and extract it to your local machine directly.
Download the tarball from the Apache Kafka download site. The following command downloads Apache Kafka version 3.5:
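A sketch of how the download URL is put together. The text only says "version 3.5", so the patch release `3.5.0` and the Scala build `2.13` are assumptions here, as is the choice of the Apache archive host, which keeps older releases available. `kafka_download_url` is a hypothetical helper name.

```shell
# Build the download URL for a Kafka binary release on the Apache archive.
# $1: Kafka version (assumed 3.5.0), $2: Scala build (assumed 2.13).
kafka_download_url() {
  printf 'https://archive.apache.org/dist/kafka/%s/kafka_%s-%s.tgz\n' "$1" "$2" "$1"
}

kafka_download_url 3.5.0 2.13
```

To fetch the archive into the current directory, pass the URL to a downloader, for example `wget "$(kafka_download_url 3.5.0 2.13)"`.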
Run the following commands to untar the Kafka archive, and cd to the kafka directory:
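A minimal sketch of the extraction step. It assumes the archive unpacks into a top-level directory named after the file without its `.tgz` suffix, which is how Kafka binary releases are laid out; `extract_kafka` is a hypothetical helper name.

```shell
# Extract a Kafka .tgz into the current directory and print the name
# of the directory it is expected to create.
extract_kafka() {
  archive="$1"
  tar -xzf "$archive" || return 1
  basename "$archive" .tgz
}
```

For example, `cd "$(extract_kafka kafka_2.13-3.5.0.tgz)"` unpacks the archive and changes into the new directory (the file name here is the assumed 3.5.0 / Scala 2.13 build).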
Run the ls -al command to list the contents of the kafka directory:
Run the kafka-storage.sh script to generate a cluster ID:
Run the kafka-storage.sh script again to format the log directories:
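The two kafka-storage.sh steps can be sketched together. This assumes a KRaft-mode setup using the `config/kraft/server.properties` file that ships with Kafka 3.x, and a `KAFKA_HOME` variable pointing at the extracted directory (it defaults to the current directory). `format_kafka_storage` is a hypothetical helper name.

```shell
# Assumption: Kafka is unpacked in the current directory unless KAFKA_HOME says otherwise.
KAFKA_HOME="${KAFKA_HOME:-.}"

# Generate a cluster ID, then format the storage directories with it.
format_kafka_storage() {
  cluster_id="$("$KAFKA_HOME/bin/kafka-storage.sh" random-uuid)" || return 1
  echo "Cluster ID: $cluster_id"
  "$KAFKA_HOME/bin/kafka-storage.sh" format -t "$cluster_id" \
    -c "$KAFKA_HOME/config/kraft/server.properties"
}
```

Run `format_kafka_storage` once from the kafka directory before the first server start.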
Your output should resemble:
Run the kafka-server-start.sh script to start the Kafka server:
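A sketch of the start step, again assuming the KRaft configuration file `config/kraft/server.properties` and a `KAFKA_HOME` variable that defaults to the current directory. `start_kafka_server` is a hypothetical helper name; the existence check is an added convenience, not part of Kafka.

```shell
# Assumption: Kafka is unpacked in the current directory unless KAFKA_HOME says otherwise.
KAFKA_HOME="${KAFKA_HOME:-.}"

# Start the broker in the foreground with the KRaft config.
start_kafka_server() {
  config="$KAFKA_HOME/config/kraft/server.properties"
  if [ ! -f "$config" ]; then
    echo "missing config: $config" >&2
    return 1
  fi
  "$KAFKA_HOME/bin/kafka-server-start.sh" "$config"
}
```

`start_kafka_server` keeps running in the terminal until you stop it with Ctrl+C.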
There will be a lot of output, and Kafka Server will be ready in a short time, typically around a second or two.
Your output should resemble the following screenshot:
In this step, you open two terminal windows, one to run a producer and another to run a consumer.
Open another terminal session and run the kafka-topics command to create a Kafka topic named demo-messages:
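A sketch of the topic-creation step. It assumes the broker listens on `localhost:9092`, the default in the stock configuration, and that `KAFKA_HOME` points at the extracted directory. `create_topic` is a hypothetical helper name.

```shell
# Assumptions: Kafka unpacked in the current directory, broker on localhost:9092.
KAFKA_HOME="${KAFKA_HOME:-.}"
BOOTSTRAP_SERVER="${BOOTSTRAP_SERVER:-localhost:9092}"

# Create a topic with the broker's default partition and replication settings.
create_topic() {
  "$KAFKA_HOME/bin/kafka-topics.sh" --create --topic "$1" \
    --bootstrap-server "$BOOTSTRAP_SERVER"
}
```

For this walkthrough, the call is `create_topic demo-messages`.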
Run the kafka-console-producer command to start producing events to the topic:
When you’re prompted, type a few lines of text to produce some events:
Open another terminal session and run the following command to start consuming events:
Your output should resemble this:
Arrange the producer and consumer terminal windows to be side by side. In the producer terminal, type a few more messages, and watch as they appear in the consumer terminal.
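The producer and consumer commands used in this step can be sketched as follows, assuming a broker on `localhost:9092` and `KAFKA_HOME` pointing at the extracted directory. `start_producer` and `start_consumer` are hypothetical helper names; `--from-beginning` makes the consumer replay events that were produced before it started.

```shell
# Assumptions: Kafka unpacked in the current directory, broker on localhost:9092.
KAFKA_HOME="${KAFKA_HOME:-.}"
BOOTSTRAP_SERVER="${BOOTSTRAP_SERVER:-localhost:9092}"

# Read lines from the terminal and send each one as an event to the topic.
start_producer() {
  "$KAFKA_HOME/bin/kafka-console-producer.sh" --topic "$1" \
    --bootstrap-server "$BOOTSTRAP_SERVER"
}

# Print every event in the topic, including those already stored.
start_consumer() {
  "$KAFKA_HOME/bin/kafka-console-consumer.sh" --topic "$1" --from-beginning \
    --bootstrap-server "$BOOTSTRAP_SERVER"
}
```

Run `start_producer demo-messages` in one terminal and `start_consumer demo-messages` in the other.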
When you’re done experimenting with Kafka, follow these steps to exit the Kafka environment:
Stop the consumer and producer clients with Ctrl+C
Stop the Kafka Server with Ctrl+C
Run the following command to clean up:
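A sketch of the cleanup, assuming the stock configuration files, which write their data under `/tmp` (`kraft-combined-logs` for KRaft mode, `kafka-logs` and `zookeeper` for a ZooKeeper-based setup). `cleanup_kafka_data` is a hypothetical helper name. This deletes all topics and events created during the session.

```shell
# Remove Kafka data directories left behind by a local test run.
# $1: base directory holding the data (defaults to /tmp).
cleanup_kafka_data() {
  base="${1:-/tmp}"
  rm -rf "$base/kafka-logs" "$base/zookeeper" "$base/kraft-combined-logs"
}
```

Call `cleanup_kafka_data` only after the Kafka server has stopped.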
There are lots of Kafka-on-Windows tutorials, but most make the mistake of running Kafka directly on the JVM on Windows. Superficially, this appears to work, but there are limitations: Kafka uses specific features of POSIX to achieve high performance, so emulations—which happen on WSL 1—are insufficient. For example, the broker will crash when it rolls a segment file. Always run Kafka on Windows in a Linux environment backed by WSL 2.
Another approach that works well is to run Kafka in Docker containers. Docker Desktop for Windows has been updated to use the WSL 2 back end, so Docker works exactly as it does on native Linux, without needing to spin up an entire VM. If you want to give this approach a go, try it out using the Confluent Platform demo.
Although Kafka provides an event streaming platform to build your applications on, you’ll want to take advantage of the broader ecosystem of components—like ksqlDB, Confluent Schema Registry, and Confluent Control Center—all provided as part of Confluent Platform. At the moment, Confluent Platform is supported for experimentation only on Windows, not for production or development environments.
Now that you have Kafka installed, you’ll want to learn more about it, try out the numerous tutorials, and join the community! Don’t forget that Apache Kafka has many APIs—including the producer and consumer but also Kafka Streams and Kafka Connect.
You may recall a time when Linux was anathema to Microsoft. Back in 2001, Microsoft CEO Steve Ballmer famously called Linux a “malignant cancer,” but he has since come around to “loving” it. Microsoft’s current CEO Satya Nadella seems intent on making it a first-class citizen in the Microsoft ecosystem, which means that a new era has arrived for software developers on the Windows platform.
When the Windows Subsystem for Linux (WSL 1) was released in 2016, it became possible to run a real Linux dev environment in a Linux shell, while retaining the familiar Windows UX around the shell. Even File Explorer was integrated nicely with the Linux file system.
The big drawback is that WSL 1 does not run a real Linux kernel. Instead, it translates Linux system calls into Windows ones, which means processes that require a native kernel, like Docker, can't run. WSL 1 was not sufficient to run Kafka reliably.
But Microsoft delivered WSL 2 in 2019, and it's a whole new world. WSL 2 runs a real Linux kernel inside a lightweight utility VM built on a subset of Hyper-V features, rather than a traditional full VM. For details, see Comparing WSL 1 and WSL 2. Now the path is clear for devs to build Kafka and Kafka Streams apps on Windows.
✍️ Editor's note: This blog post was originally published by Jim Galasyn on December 9, 2020. As of September 7, 2023, the content has been updated with the latest information and best practices.