Hypertable (et al.) v. Postgres, SSI v. MSI clustering, and hardware scaling

First, a few specifics on what I'm working with and my tentative plan.

  • Hardware: 15 machines; each with 2x 3.0 GHz "Nocona" (mid-2004) Xeon, 4 GB DDR2, 2x 73 GB SCSI.
    In aggregate, this works out to 30x 3.0 GHz CPUs, 60 GB RAM, and 2.14 TB of disk.
    (There will also be some extra decently powered machines in the cluster, but this is the bulk of it.)

  • Architecture: One high-powered single-system-image (SSI) cluster of five of those machines plus any extras, with the remaining ten working independently.

  • Database: Hypertable, set up on the 10 remaining nodes to work over HDFS across them.

  • The SSI cluster will act as the Hypertable master and run a custom application and the nginx Web server over Kerrighed/XtreemOS, while the remaining servers will run some form of Debian.

  • The database is meant to scale and perform well under heavy read and write load (~7000 users).


First of all, are there any glaring issues with this plan?

Now, my main questions:

  1. How does the current architecture compare to a fully SSI'd setup or a fully MSI'd setup (LinuxPMI) in terms of performance and scalability?

  2. Given that I'm working with some pretty old CPUs, does this hardware seem adequate, or should I invest in more machines?

  3. For my use case, what are the relative pros and cons of a datastore like Hypertable v. a more traditional RDBMS like PostgreSQL? Would it make more sense to use HT exclusively or to give part of the data (say, user management) to Postgres (or even to use Postgres exclusively)?

  4. Going off the last one, at this point in time how much of a risk am I taking in using Hypertable over HBase (or Cassandra)? It's worth noting that this won't need to be deployed until after the summer, around the time of Hypertable's scheduled 1.0 release.

  5. I'm also planning on using Go, so what are your thoughts on go-pgsql and thrift4go?

Thanks.

asked May 14, 2011 3:40 pm CDT
posted via ServerFault

1 Answer

Respectfully, you are very confused and your question cannot be answered. You want to compare:

  • Several NoSQL data stores with different data models (Hypertable, Cassandra, HBase)
  • A relational database (Postgres)
  • A couple of clustering frameworks (SSI, MSI)
  • A brand new programming language with little adoption (Go)

and you're providing no specifics on what your application does, no code profiling, and no end user usage patterns. There is no way to fix this; you're comparing apples to oranges to banana peels, and you're giving no factual information.

This calls for the classic quote from Donald Knuth:

There is no doubt that the grail of efficiency leads to abuse. Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%. A good programmer will not be lulled into complacency by such reasoning, he will be wise to look carefully at the critical code; but only after that code has been identified.

You might want to break this question up into several questions, each focusing on one technology, and with specifics about what you're doing and what problems you're seeing.

answered May 14, 2011 6:15 pm CDT
