dbShards

And get a unique database solution that grows when you grow and never costs more than it should.

Featured product

 

Use our no-charge dbShards/Analyze driver to identify critical performance issues in your database tier. A dbShards consultant will then coordinate with you to identify key hotspots, helping you to develop a plan for optimizing your database.

DBMS2 Talks About Database Scale-out

Monday, November 7, 2011

“There’s a perception that, if you want (relatively) worry-free database scale-out, you need a non-relational/NoSQL strategy. That perception is false. In the analytic case it’s completely ridiculous, as has been demonstrated by Teradata, Vertica, Netezza, and various other MPP (Massively Parallel Processing) analytic DBMS vendors. And now it’s false for short-request/OLTP (OnLine Transaction Processing) use cases as well.”

See full article here.

Posted in News & Commentary by admin - No Comments »

dbShards CEO/CTO, Cory Isaacson, will be giving a webinar about scaling your SQL and NoSQL databases in the cloud. Sponsored by RightScale, the webinar will be held on the 21st of July 2011, at 2pm ET.

As the demand for intensive, large database-driven applications is on the rise, it is important to be proactive in your database architecture. We will help to give you the information you need to properly design, manage and scale your database with your application growth.

Reserve your spot now!

Posted in News & Commentary by choisington - No Comments »

RightScale User Conference NYC 2011

Thursday, June 30, 2011

Another productive and informative user conference from the RightScale team! See full article here.

RightScale User Conference NYC 2011 from RightScale on Vimeo.

Posted in News & Commentary by choisington - No Comments »

Our CEO, Cory Isaacson, sits down with Jeremy Geelan to talk about dbShards.

dbShards Talks to SYS-CON.TV at Cloud Expo New York: Cory Isaacson, CEO at CodeFutures, speaks with Cloud Expo Conference Chair Jeremy Geelan on SYS-CON.tv at Cloud Expo New York, held June 6-9, 2011 at the Jacob Javits Convention Center, New York, NY.

Posted in News & Commentary by choisington - No Comments »

dbShards 2011 Cloud Expo New York

Wednesday, June 1, 2011

We are proud to announce dbShards’ involvement in the 8th International Cloud Expo at New York City’s Javits Convention Center. During the week of June 6th, dbShards’ CEO, Cory Isaacson, will speak at multiple events covering the benefits of database sharding, with an emphasis on its use in the cloud.

Schedule:
Wednesday June 8, 2011
CEO Panel
Bring all your dbShards questions straight to our CEO, Cory Isaacson.

Scaling your database in the cloud
A full presentation of dbShards with a demo to show the power and benefits of database sharding.

dbShards will also be attending RightScale’s user conference. RightScale’s User Conference will be hosted at the Javits Convention Center inside of the Cloud Expo 2011. Maximize your opportunity to share and learn about the cloud by attending both events.

Database scalability is integral to a company’s growth and success in the cloud, and dbShards is leading the way with high performance and high availability solutions. While at the Cloud Expo, please visit our booth at location A1 for more information. And, as always, email us with any further questions.

Posted in News & Commentary by admin - No Comments »

Dan Kusnetzky recently published an article about a dbShards customer, Familybuilder. The article details the benefits of using dbShards with your application.

“I like to hear from organizations using products not just the suppliers of those products. I believe that gives a more complete view of a product than a simple conversation with the supplier could offer. David Blinder, CTO of Familybuilder, let me know a bit about his experiences with codeFutures’ dbShards (see codeFutures dbShards for more information about the supplier.) Thanks for taking the time to communicate with me, David.

Please introduce yourself and your organization:

My name is David Blinder and I am the CTO of Familybuilder. Our company develops family-oriented social network applications. Our flagship application is Family Tree on Facebook, boasting over 35 million users with over 5 million monthly active uniques. The application is very user centric and data intensive, housing information on family relations, both in network and out. Family Tree has been the top application for families in the social space for several years. Our company is considered a seasoned start-up, at this point, with ongoing and far reaching potential.

Family Tree started out as is typically the case – in a conventional hosting environment. Initial issues with a code base strewn with inefficiencies plagued our growth but were remedied and re-factored early in our history. The insult came when we hit a wall with our conventional hosting choice and its restrictive downsides. Calls to NOCs at all hours to get updates on machine builds as well as restarts and a world of other communication issues led us to investigate the growth in cloud computing. We ported our applications over to Amazon’s cloud and leveraged EC2 and S3 immediately to support our rapid growth.

What were you doing that needed this type of technology?

Our application was a great fit in the social network space. Users seeking to create subsets of friends into more meaningful classes of relationships found our application appealing for family interaction/communication and suggested it to other relatives.”

See full article here.

Posted in News & Commentary by admin - No Comments »

Curt Monash clarified some common questions about dbShards:

After I posted recently about dbShards, a Very Smart Commenter emailed me with the challenge “but each individual shard is still replicated via two-phase commit, and everybody knows two-phase commit is fundamentally slow.” I replied that no, it wasn’t exactly two-phase commit, but fumbled the explanation of why — so I decided to escalate straight to dbShards honcho Cory Isaacson. Cory’s clarification, lightly edited as per his permission, was:

See full article here.

Posted in News & Commentary by choisington - No Comments »

ZDNet and dbShards

Tuesday, February 8, 2011

Dan Kusnetzky recently released an article detailing the structure of dbShards and how it works to scale your application:

Cory Isaacson, CEO, of codeFutures, and I spoke about how today’s distributed applications often are more highly scalable than the database infrastructure they rely on. His company, codeFutures, would propose turning to a highly scalable database infrastructure rather than relying on older, more centralized database engines. He also points out that the centralized approach to data management might limit cloud computing application performance as well.

codeFutures makes its case

Cory would tick off the following points to make his case: ……

See full article here.

Posted in News & Commentary by choisington - No Comments »

There has been a lot of interest lately in NoSQL databases and, of course, many of us have strong backgrounds and experience in traditional relational “SQL” databases. For application developers this raises questions concerning the best way to go.

One recurring truth that eventually surfaces with all new software technologies is that “one size does not fit all.” In other words, you need to use the right tool for the job, as each has its own strengths and weaknesses. In fact, a danger of many new architectural approaches is one of “over-adoption” – using a given tool to address a wide array of situations when it was originally designed for the specific problem domain in which it excels.

Therefore, the right answer to the question of whether to use a “SQL or NoSQL?” database is: “it depends.” The best solution for your application may be a traditional SQL database, a NoSQL database, or possibly a mix of both. Each technology has its own areas of use, and the best recommendation is to investigate specific products to meet your specific needs. It’s also important to consider your existing investment in what you have that is functional and proven, working out ways to preserve that investment while extending into new ways of doing things for improved application performance and capabilities.

To read our full article published in Database Trends and Applications, please visit the link below:

http://www.dbta.com/Articles/Editorial/Trends-and-Applications/SQL-or-NoSQL3f-How-to-Choose-the-Right-Database-for-Your-Application-71240.aspx

Posted in News & Commentary by dbShards Team - No Comments »

Black Box vs. Application-Aware Sharding

Monday, September 20, 2010

There are many techniques for partitioning a database, all under the heading of “sharding.” But regardless of which technique you are referring to, the purpose is basically the same – spread your data out in a manner that allows parallel processing across multiple servers, allowing your database to scale with load. Done correctly, the results can be pretty amazing, frankly even surprising some of us on the dbShards team. In short, sharding really works.

Lately we have seen a lot of talk about the black box sharding capabilities offered by some products. The way this works is to have the data sharded mechanically by the product, so that the developer or user of the database doesn’t need to pay attention to it. In essence, the sharding product “auto-magically” partitions your data.

When you drill down further, you discover that such techniques distribute data at the row level (for relational databases), or at the object level (for NoSQL databases). Each row request is then satisfied by the particular data store that contains it, retrieving the row (or rows) via a network. Most often there is a middle-tier server doing this work, so that your typical database client (e.g., the MySQL driver) can just connect to the middle-tier, and the whole thing looks just like a single monolithic database.
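To make the row-level distribution concrete, here is a minimal Python sketch. It is not taken from any particular product: the names `NUM_SHARDS`, `shard_for_row`, and `place_rows`, the sample order rows, and the choice of hashing each row's own primary key are all assumptions made for illustration.

```python
import zlib

NUM_SHARDS = 4  # hypothetical cluster size


def shard_for_row(table: str, primary_key: int) -> int:
    """Pick a shard from the row's own identity (assumed black box policy)."""
    key = f"{table}:{primary_key}".encode()
    return zlib.crc32(key) % NUM_SHARDS


def place_rows(table, rows):
    """Group rows by the shard that would store them."""
    placement = {}
    for row in rows:
        placement.setdefault(shard_for_row(table, row["id"]), []).append(row)
    return placement


# Ten orders that all belong to the same (hypothetical) user.
orders = [{"id": order_id, "user_id": 42} for order_id in range(1, 11)]
placement = place_rows("order", orders)
shards_touched = sorted(placement)
print(f"user 42's orders live on shards {shards_touched}")
```

Because the placement ignores `user_id`, nothing keeps one user's orders together, so reading them back as a set may mean visiting several shards over the network.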

This concept understandably has a lot of appeal, as who wouldn’t want a “plug and play” setup that does “auto-magic” scaling for your database?

The answer is that there is a cost to this type of approach. (In fact we investigated it early on for dbShards and concluded it was not the best long-term way to address sharding of a relational database.) Because rows for all tables are sharded mechanically, any row can end up in any shard. The network access for each row adds overhead, even if queries are somehow optimized so that groups of rows are grabbed from a given partition with a single call. For single row reads/writes (just like the NoSQL databases) this is not much overhead, but when it comes to sets of data, the cost in application performance can be high. Think of joins, aggregates, sorts – all of the things you do frequently in an application – and imagine that instead of a local disk or memory read (as is the case with MyISAM or InnoDB) you have to read across the network. This is especially burdensome in cloud environments where you don’t have control of shared network I/O, which is generally slower than in dedicated environments to begin with.

It follows, we believe, that as you add more partitions (dozens or hundreds, as these products claim), the degradation for multi-row sets will be dramatic. This means that as your database gets bigger, you could see a “hockey stick” degradation curve, something that is hard to predict and the last thing you want to encounter mid-stream in a production application.
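One rough way to reason about this is to count how many partitions a multi-row read has to visit. The sketch below is an illustrative model only, not a measurement: it assumes rows are placed uniformly and independently across partitions, in which case the expected number of distinct partitions touched by a set of k rows over n partitions is n(1 - (1 - 1/n)^k). The function name and the sample sizes are hypothetical.

```python
def expected_partitions_touched(num_partitions: int, rows_in_set: int) -> float:
    """Expected distinct partitions holding at least one of the rows,
    assuming uniform, independent row placement."""
    miss = (1 - 1 / num_partitions) ** rows_in_set  # chance a partition holds none
    return num_partitions * (1 - miss)


# Fan-out for a 100-row result set as the (hypothetical) cluster grows.
for n in (1, 4, 16, 64, 256):
    print(n, round(expected_partitions_touched(n, 100), 1))
```

The model covers fan-out only. It says nothing about latency per call, which depends on the network and on how well a given product batches its requests.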

With dbShards we use a technique called Application-Aware Sharding, which really means that you as the developer decide up front how to shard your database, based on your application’s specific requirements.

The primary rationale for black box partitioning approaches is that “application-based sharding requires a lot of sophistication” to implement and operate. It’s true that doing this totally on your own can be challenging, but honestly the concepts of how to go about it are extremely simple and easy to grasp.

Here is why.

The truth is that in any given application, only a few tables grow large enough, with high enough transaction volumes, to even justify sharding. From our experience this is usually 5% – 10% of the total number of tables, meaning that you really only need to shard 5 or 10 tables in an application that has 100 tables in its schema. Identifying them is trivial: just look at your largest, most active tables and the list will pop out at you. These tables naturally form a “shard tree” with a single parent table (e.g., a “user” table), and there you have all of the basics for your sharding strategy. (Sometimes we find applications with 2 or 3 shard trees, but still the number of tables is small.)
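The selection step described above can be sketched in a few lines of Python. Everything here is hypothetical: the `TableStats` record, the sample schema, and the thresholds are invented for illustration and are not part of dbShards. In practice the numbers would come from your own database statistics.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TableStats:
    name: str
    rows: int
    writes_per_sec: float
    parent: Optional[str] = None  # table this one references, if any


def shard_candidates(tables, min_rows, min_writes):
    """Keep only tables that are both large and busy."""
    return [t for t in tables
            if t.rows >= min_rows and t.writes_per_sec >= min_writes]


def shard_tree_root(table, by_name):
    """Walk parent links up to the top of the shard tree."""
    while table.parent is not None:
        table = by_name[table.parent]
    return table.name


# Hypothetical schema statistics.
tables = [
    TableStats("user", 40_000_000, 300.0),
    TableStats("order", 250_000_000, 1200.0, parent="user"),
    TableStats("order_item", 900_000_000, 3500.0, parent="order"),
    TableStats("country", 250, 0.0),
    TableStats("currency", 180, 0.0),
]
by_name = {t.name: t for t in tables}

candidates = shard_candidates(tables, min_rows=10_000_000, min_writes=100.0)
roots = {t.name: shard_tree_root(t, by_name) for t in candidates}
print(roots)
```

In this made-up schema the three large tables all trace back to `user`, which would make the user key the natural shard key, while the small lookup tables are left out of the shard tree.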

By definition, the shard tree is made up of related tables (after all, that’s what a relational database is all about). With Application-Aware Sharding, all related data is located in the same shard (e.g., “order” data for a given user), and with optimal shard sizing, this technique uses the power of proven database engines to achieve incredibly fast read/write access. Just remember how fast your database was when your application first started out, multiply that by the number of shards, and you immediately see the potential. Joins, multi-row result sets, and aggregates perform extremely well because your most frequent accesses stay within a single shard. There is no middle tier, with nothing between your application and your database. Sharding decisions are made by the database driver itself, with application requests going directly to the database engine (just like they always have). Less frequent “Go Fish” queries are performed in parallel across multiple shards when needed, again totally seamless to your application.
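The routing idea can be sketched as follows. This is a toy model under stated assumptions, not the dbShards driver: plain Python lists stand in for the database engines, modulo on the user id is an assumed routing rule, and the names `Shard`, `shard_for_user`, `user_order_total`, and `go_fish_total` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_SHARDS = 4  # hypothetical shard count


class Shard:
    """Stand-in for one database engine holding a slice of the shard tree."""

    def __init__(self):
        self.users = {}   # user_id -> name
        self.orders = []  # (user_id, amount) rows


shards = [Shard() for _ in range(NUM_SHARDS)]


def shard_for_user(user_id):
    """Route by the shard key; modulo is an assumed rule for illustration."""
    return shards[user_id % NUM_SHARDS]


def add_user(user_id, name):
    shard_for_user(user_id).users[user_id] = name


def add_order(user_id, amount):
    # Child rows follow their parent's key, so related data stays together.
    shard_for_user(user_id).orders.append((user_id, amount))


def user_order_total(user_id):
    """Frequent query: answered entirely inside one shard."""
    shard = shard_for_user(user_id)
    return sum(amount for uid, amount in shard.orders if uid == user_id)


def go_fish_total():
    """Occasional query: ask every shard in parallel, then combine."""
    with ThreadPoolExecutor(max_workers=NUM_SHARDS) as pool:
        partials = pool.map(lambda s: sum(a for _, a in s.orders), shards)
        return sum(partials)


for uid in range(1, 9):
    add_user(uid, f"user{uid}")
    add_order(uid, 10 * uid)
    add_order(uid, 5)

print(user_order_total(3), go_fish_total())
```

The per-user query never leaves its shard, while the cross-shard total fans out once per shard and merges the partial sums, which is the split between frequent and “Go Fish” access described above.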

Herein lies the biggest difference between this technique and the black box sharding described above. With mechanical sharding techniques, you give up control of the partitioning logic, and all of your data is distributed (not just the tables you really need to shard). Because related data can be anywhere, over time and with more nodes in the data layer, the results can be very unpredictable, especially if you need multi-table joins, aggregates or other complex query support.

Once you have identified your sharding strategy, a product like dbShards makes the rest very easy to do. A simple dump from your current database can be loaded into the sharded environment, and other features such as Global Tables, reliable replication, continuous operation, re-sharding, and parallel “Go Fish” queries make the whole environment virtually transparent to your application. The point is, however, that Application-Aware Sharding does take some up-front planning and analysis to make it work (an hour or less if you know your app), and sometimes minimal application changes are required if you want to achieve the best performance possible. This is the case with any sensible performance tuning effort, so regardless of your direction you should plan for it.

Our tools guide you through the process, and based on the results our customers are experiencing, it’s well worth the effort. We have real production applications writing billions of rows, with incredibly fast read rates. For example, we have seen 650,000 row reads/second in cloud-based applications using dbShards in a MySQL environment with just a small number of shards.

Another nice benefit of Application-Aware Sharding is that you are in total control of your database performance, with simple tools and “hints” to further tune data access behavior when necessary, just as you would with any other database environment.

The result is predictable, linear (or better) scaling for your application for its entire lifespan.

In summary, if your application needs to scale, spend the time to drill down to the details, investigate all the alternatives, and make sure that you choose the best solution for your particular needs.

Posted in News & Commentary by Cory Isaacson - 2 Comments »