Tutorial: Getting started with Apache Cassandra

  • September 3, 2019

Applications today create lots of data, and to get value from that data you have to capture it in the right way. If your application has to serve thousands or millions of customers, or take in a high volume of writes from devices, the database behind it must scale easily. If you are looking for a database with massive scalability and high availability, consider Apache Cassandra™.

Firstly, Apache Cassandra has a fully distributed architecture, which makes scaling up over time straightforward: you add more nodes. Secondly, Cassandra replicates your data across data centres and cloud platforms, so losing a node or a site, whether for maintenance or through unforeseen circumstances, need not take the database offline. It also means you can run across multiple cloud services at the same time and avoid being locked into a specific one. This keeps you in control.

Lastly, choosing Apache Cassandra means you’ll be in great company. Cassandra is currently in use at CERN, Comcast, eBay, GitHub, GoDaddy, Hulu, Instagram, Intuit, Netflix, Reddit, The Weather Channel, and many more companies running active global datasets.

Getting Started with Cassandra

Getting Apache Cassandra up and running involves creating a cluster of Cassandra instances, or nodes. You can then connect to your cluster using any of the drivers for Apache Cassandra™, which come in different languages such as Java, Python, C++, C#, Node.js, Ruby, and PHP.

Below we’ll go through the steps to create a simple Java application using version 3.7.1 of the DataStax Java Driver for Apache Cassandra™. There are API changes for newer versions of the Java driver (4.0+). Please make sure you use the appropriate version for this example.

For this tutorial, we’ll create a service for a video recommendation application, called KillrVideo, that stores video data and reads it back. It should be a useful starting point for learning how Cassandra works and how you can apply it in your own applications.

KillrVideo has a three-tiered architecture that is common for cloud-scale applications, with a web application tier, a services tier and a database tier. The full application uses a microservice approach with multiple stateless services.

Prerequisites:

  • Set your deployment to use public IPs for your nodes.
  • Download the driver from GitHub and add it to your CLASSPATH, or add the following dependency to your Maven POM file:

<dependency>
  <groupId>com.datastax.cassandra</groupId>
  <artifactId>cassandra-driver-core</artifactId>
  <version>3.7.1</version>
</dependency>

1. Create a Cluster object

Cluster cluster = Cluster.builder().addContactPoint("40.83.177.33").build();

  • The Cluster object is the starting point for connecting to a Cassandra cluster, and is created through the builder returned by the static Cluster.builder() method.
  • Replace the IP address shown in the addContactPoint() method with the public IP of the node in your deployment.
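Rather than editing the IP address in the source, the address can be read from configuration. The following is a minimal sketch in plain Java, not part of the original tutorial: the ContactPoints class and the CASSANDRA_CONTACT_POINTS environment variable are hypothetical names chosen for illustration, and the driver does not read that variable itself.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: turns a comma-separated setting into contact point addresses. */
final class ContactPoints {

    // Hypothetical variable name chosen for this sketch; the driver does not read it itself.
    static final String ENV_VAR = "CASSANDRA_CONTACT_POINTS";

    /** Splits "10.0.0.1, 10.0.0.2" into its addresses, skipping blank entries. */
    static List<String> parse(String commaSeparated) {
        List<String> addresses = new ArrayList<>();
        if (commaSeparated == null) {
            return addresses; // variable not set: no contact points
        }
        for (String part : commaSeparated.split(",")) {
            String address = part.trim();
            if (!address.isEmpty()) {
                addresses.add(address);
            }
        }
        return addresses;
    }

    public static void main(String[] args) {
        List<String> addresses = parse(System.getenv(ENV_VAR));
        if (addresses.isEmpty()) {
            System.err.println("Set " + ENV_VAR + " to one or more node addresses");
            return;
        }
        // With the 3.x driver on the classpath, the list could then be handed to the builder:
        // Cluster cluster = Cluster.builder()
        //         .addContactPoints(addresses.toArray(new String[0]))
        //         .build();
        System.out.println("Contact points: " + addresses);
    }
}
```

Listing more than one node is a common choice, since the driver then has an alternative if the first contact point is unreachable when the application starts.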

2. Create a Session object

Session session = cluster.connect();

  • Calling connect() is the point at which the driver opens connections to the cluster nodes.

3. Execute statements using the Session object

session.execute("CREATE KEYSPACE IF NOT EXISTS killrvideo WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};");
session.execute("CREATE TABLE IF NOT EXISTS killrvideo.videos (name TEXT, description TEXT, PRIMARY KEY(name))");

session.execute("INSERT INTO killrvideo.videos (name, description) VALUES (?, ?);", "Avengers: Endgame", "No spoilers");

ResultSet rs = session.execute("SELECT * FROM killrvideo.videos WHERE name = ?;", "Avengers: Endgame");

for (Row row : rs) {
    System.out.println("Name:" + row.getString("name"));
    System.out.println("Description: " + row.getString("description"));
}

  • The execute() method is used to run a CQL statement.
  • The first two statements create our data model: a keyspace, killrvideo, and a table, videos. Normally you’ll want to create the data model outside of the application, but we do it here to minimize the steps for this example.
  • The third statement inserts a row into the newly created table.
  • The last statement runs a query, which returns a ResultSet object.
  • The ResultSet is an Iterable, so a for-each loop can step through each row of the query results.
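The keyspace statement above uses SimpleStrategy with a replication factor of 1, which suits a single-node test. For replication across several data centres, the keyspace would normally use NetworkTopologyStrategy with a replica count per data centre. The sketch below, in plain Java, only builds that CQL string. The KeyspaceCql class and the data centre names dc_europe and dc_us are hypothetical; real names have to match the ones your cluster reports.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that builds a multi-data-centre CREATE KEYSPACE statement. */
final class KeyspaceCql {

    /**
     * Returns a CREATE KEYSPACE statement using NetworkTopologyStrategy,
     * with one replica count per data centre. Names are inserted as-is,
     * so only pass trusted values.
     */
    static String networkTopology(String keyspace, Map<String, Integer> replicasPerDataCentre) {
        StringBuilder cql = new StringBuilder("CREATE KEYSPACE IF NOT EXISTS ")
                .append(keyspace)
                .append(" WITH replication = {'class': 'NetworkTopologyStrategy'");
        for (Map.Entry<String, Integer> dc : replicasPerDataCentre.entrySet()) {
            cql.append(", '").append(dc.getKey()).append("': ").append(dc.getValue());
        }
        return cql.append("};").toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the data centres in insertion order.
        Map<String, Integer> replicas = new LinkedHashMap<>();
        replicas.put("dc_europe", 3); // assumed data centre name
        replicas.put("dc_us", 3);     // assumed data centre name
        System.out.println(networkTopology("killrvideo", replicas));
    }
}
```

The resulting string could be passed to session.execute() in place of the SimpleStrategy statement.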

The full code

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class Application {

   public static void main(String[] args) {
       // Cluster and Session are both Closeable, so try-with-resources
       // closes the session and then the cluster, even if a statement throws.
       try (Cluster cluster = Cluster.builder().addContactPoint("40.83.177.33").build();
            Session session = cluster.connect()) {

           // Create the data model: a keyspace and a table.
           session.execute("CREATE KEYSPACE IF NOT EXISTS killrvideo WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};");
           session.execute("CREATE TABLE IF NOT EXISTS killrvideo.videos (name TEXT, description TEXT, PRIMARY KEY(name))");

           // Insert one row, passing the values as bind parameters.
           session.execute("INSERT INTO killrvideo.videos (name, description) VALUES (?, ?);", "Avengers: Endgame", "No spoilers");

           // Query the row back by its primary key.
           ResultSet rs = session.execute("SELECT * FROM killrvideo.videos WHERE name = ?;", "Avengers: Endgame");

           // Print every row in the result.
           for (Row row : rs) {
               System.out.println("Name:" + row.getString("name"));
               System.out.println("Description: " + row.getString("description"));
           }
       }
   }
}


Run the application

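Compile and run the Application class with the driver on the classpath. It prints the name and description of the row it inserted.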

What’s next?

These are the first steps in building an app with Apache Cassandra™.

For questions about architecture, procedures, and best practices, refer to the Apache Cassandra documentation.

You can also read the KillrVideo architecture guide to see how the app is designed: https://killrvideo.github.io/docs/guides/architecture/

The post Tutorial: Getting started with Apache Cassandra appeared first on JAXenter.
