Cassandra: The Definitive Guide
Download - Cassandra: The Definitive Guide by Eben Hewitt - PDF
Why Apache Cassandra?
Apache Cassandra is a free, open source, distributed data storage system that differs sharply from relational database management systems.
Cassandra first started as an incubation project at Apache in January of 2009. Shortly thereafter, the committers, led by Apache Cassandra Project Chair Jonathan Ellis, released version 0.3 of Cassandra, and have steadily made minor releases since that time. Though as of this writing it has not yet reached a 1.0 release, Cassandra is being used in production by some of the biggest properties on the Web, including Facebook, Twitter, Cisco, Rackspace, Digg, Cloudkick, Reddit, and more.
Cassandra has become so popular because of its outstanding technical features. It is durable, seamlessly scalable, and tuneably consistent. It performs blazingly fast writes, can store hundreds of terabytes of data, and is decentralized and symmetrical so there’s no single point of failure. It is highly available and offers a schema-free data model.
Is This Book for You?
This book is intended for a variety of audiences. It should be useful to you if you are:
- A developer working with large-scale, high-volume websites, such as Web 2.0 social applications
- An application architect or data architect who needs to understand the available options for high-performance, decentralized, elastic data stores
- A database administrator or database developer currently working with standard relational database systems who needs to understand how to implement a faulttolerant, eventually consistent data store
- A manager who wants to understand the advantages (and disadvantages) of Cassandra and related columnar databases to help make decisions about technology strategy
- A student, analyst, or researcher who is designing a project related to Cassandra or other non-relational data store options
This book is a technical guide. In many ways, Cassandra represents a new way of thinking about data. Many developers who gained their professional chops in the last 15–20 years have become well-versed in thinking about data in purely relational or object-oriented terms. Cassandra’s data model is very different and can be difficult to wrap your mind around at first, especially for those of us with entrenched ideas about what a database is (and should be).
Using Cassandra does not mean that you have to be a Java developer. However, Cassandra is written in Java, so if you’re going to dive into the source code, a solid understanding of Java is crucial. Although it’s not strictly necessary to know Java, it can help you to better understand exceptions, how to build the source code, and how to use some of the popular clients. Many of the examples in this book are in Java. But because of the interface used to access Cassandra, you can use Cassandra from a wide variety of languages, including C#, Scala, Python, and Ruby.
Finally, it is assumed that you have a good understanding of how the Web works, can use an integrated development environment (IDE), and are somewhat familiar with the typical concerns of data-driven applications. You might be a well-seasoned developer or administrator but still, on occasion, encounter tools used in the Cassandra world that you’re not familiar with. For example, Apache Ivy is used to build Cassandra, and a popular client (Hector) is available via Git. In cases where I speculate that you’ll need to do a little setup of your own in order to work with the examples, I try to support that.
- CC BY-NC-SA 3.0 PH
- The author's reference is not required