
Follow-up post on cassandra configuration with some experiments on GC tuning

Aug 24, 2010
Mikio Braun
Dear all,

thanks again for all the comments I got on my last post. I've played a
bit with different GC settings and got my Cassandra instance to run
very nicely with 8GB of heap.

I summarized my experiences with GC tuning in this follow-up post:

http://blog.mikiobraun.de/2010/08/cassandra-gc-tuning.html

-M

-- 
Dr. Mikio Braun                        email: mik### @cs.tu-berlin.de
TU Berlin                              web: ml.cs.tu-berlin.de/~mikio
Franklinstr. 28/29                     tel: +49 30 314 78627
10587 Berlin, Germany


Similar Threads
[nio_char] Issue about CharsetEncoder.flush() does not follow the spec and RI5 also doesn't follow t
See https://issues.apache.org/jira/browse/HARMONY-6594

According to the Java 5 spec, the flush() method must only be invoked
after reset() or after the three-argument encode method with a value of
true for the endOfInput parameter; otherwise an IllegalStateException is
thrown. Harmony's implementation does not implement this logic: when an
encoder is created and its flush() method is called right away, flush()
should throw IllegalStateException. I have fixed this case with
HARMONY-6594.

However, I found that RI5 does not completely follow the spec either.

For the invocation sequence reset -> encode with 3 arguments -> reset ->
flush, RI5 throws IllegalStateException, contrary to the spec
(see org.apache.harmony.nio_char.tests.java.nio.charset.ASCIICharsetEncoderTest.testInternalState_Flushed()).

For the sequence encode(CharBuffer) -> flush(), RI5 does not throw
IllegalStateException, contrary to the spec (after encode(CharBuffer), the
encoder should be in the FLUSH state)
(see org.apache.harmony.nio_char.tests.java.nio.charset.ASCIICharsetEncoderTest.testInternalState_from_Encode).

Further investigation shows that this behavior changed in the Java 6
spec, which says flush() will throw IllegalStateException if the previous
step of the current encoding operation was an invocation neither of the
flush method nor of the three-argument encode method with a value of true
for the endOfInput parameter. And actually, RI5 follows the Java 6 spec
rather than the Java 5 spec!

So now I am unsure whether we should modify the CharsetEncoder in Harmony
trunk to comply with the Java 5 spec or, on the other hand, modify it to
comply with RI5 and the Java 6 spec for the above 2 cases. Could anyone
give me some suggestions on this point?
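
The two rule sets discussed above can be compared with a small model. This is only a simplified sketch in Python, not Harmony or RI code: the step labels and the function name `flush_allowed` are made up for illustration, and the model captures nothing beyond the "previous step" rules quoted in the post. The convenience method encode(CharBuffer) is modelled as reset, then a final encode, then flush, which is how its Javadoc describes it.

```python
# Simplified model of the flush() precondition described in the post.
# Step labels are illustrative, not Java API calls:
#   "reset"      -> CharsetEncoder.reset()
#   "encode_end" -> three-argument encode with endOfInput == true
#   "flush"      -> flush(ByteBuffer)

# Which previous step makes flush() legal under each spec wording.
ALLOWED_PREVIOUS = {
    "java5": {"reset", "encode_end"},
    "java6": {"flush", "encode_end"},
}

# encode(CharBuffer) runs a whole encoding operation and ends flushed.
CONVENIENCE_ENCODE = ["reset", "encode_end", "flush"]


def flush_allowed(spec, previous_steps):
    """Return True if flush() may follow `previous_steps` under `spec`."""
    if not previous_steps:
        # Freshly created encoder: no previous step satisfies either rule.
        return False
    return previous_steps[-1] in ALLOWED_PREVIOUS[spec]


if __name__ == "__main__":
    cases = {
        "new encoder": [],
        "reset, encode_end, reset": ["reset", "encode_end", "reset"],
        "encode(CharBuffer)": CONVENIENCE_ENCODE,
    }
    for name, steps in cases.items():
        for spec in ("java5", "java6"):
            ok = flush_allowed(spec, steps)
            verdict = "ok" if ok else "IllegalStateException"
            print(f"{spec}: {name} -> flush: {verdict}")
```

Under this model the two sequences from the post come out opposite for the two specs, which matches the observation that RI5 behaves like the Java 6 wording.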


Cassandra-related post by a friend
My friend Mikeal
posted this on his blog, including a discussion of
Cassandra versus CouchDB and MongoDB:

http://www.mikealrogers.com/2010/07/m...mance-durability/

I've emailed him a couple clarifications on the discussion of Cassandra,
but it's mostly spot-on and a good read on the status of the evolving
non-relational space.





Ruby and Cassandra blog post
hey everyone,

I've been using Cassandra with Ruby for about a month now and am
finding it very helpful.  I wrote up a blog post about my experiences,
with the goal of adding more Ruby-specific examples to the
blogosphere.  Hope this is helpful!

http://www.subelsky.com/2010/05/real-...nd-cassandra.html

-Mike


Post on experiences with Cassandra for Twitter retweet analysis
Hello,

I've put up a blog post in which I discuss our experiences with using
Cassandra as the main database backend for twimpact. Twimpact is a
research project at the TU Berlin which aims at estimating user impact
based on retweet analysis. A live version of the analysis for the
Japanese market can be seen at http://twimpact.jp

So far, we're very pleased with Cassandra's performance, but we've also
had to overcome some issues, which I report on in the blog post and which
are hopefully interesting for other users of Cassandra.

The blog post can be found here:

http://blog.mikiobraun.de/2010/08/-cassandra-tips.html

-M


-- 
Dr. Mikio Braun                        email: mik### @cs.tu-berlin.de
TU Berlin                              web: ml.cs.tu-berlin.de/~mikio
Franklinstr. 28/29                     tel: +49 30 314 78627
10587 Berlin, Germany


Cassandra configuration settings
Hi All,
Could you please suggest configuration settings for a single-node setup
(Windows machine) that are both time and space efficient?





Thanks,
Sharan


apache tuning for svn
Hi Gurus,

Any advice on initial tuning values for Apache MinSpareServers,
MaxSpareServers, and StartServers, TCP tunings, ulimits, sysctl...?

I'm running svn 1.6 over apache2 pre-fork. System load goes as high as
10 during heavy usage.

What would be your recommended MaxRequestsPerChild for heavy svn usage?

Thanks,


West


Tuning garbage collection
Hi,

I'm using Tomcat 6.0.26, Java 1.6 and wondering what tools/strategies you
use to tune your garbage collection parameters?

Further, does anyone know how to read entries in the garbage collection
log? Entries in my log look like

Desired survivor size 10944512 bytes, new threshold 1 (max 15)
 [PSYoungGen: 129311K->3232K(136512K)] 558882K->434085K(585920K), 0.0090900 secs]

Thanks, - Dave
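
As a rough aid for reading such an entry: in HotSpot's parallel collector output, the bracketed PSYoungGen part gives the young generation occupancy before and after the collection (with its capacity in parentheses), the next group gives the same for the whole heap, and the last number is the pause time in seconds. The sketch below parses exactly that shape in Python. The function name `parse_gc_entry` and the field names are illustrative choices, and the "promoted" figure is only an estimate derived from the two differences.

```python
import re

# Pattern for one young-generation collection entry as shown in the post.
# \s* between parts tolerates entries that were wrapped across lines.
GC_ENTRY = re.compile(
    r"\[PSYoungGen: (?P<young_before>\d+)K->(?P<young_after>\d+)K"
    r"\((?P<young_capacity>\d+)K\)\]\s*"
    r"(?P<heap_before>\d+)K->(?P<heap_after>\d+)K"
    r"\((?P<heap_capacity>\d+)K\),\s*"
    r"(?P<pause_secs>\d+\.\d+)\s*secs"
)


def parse_gc_entry(text):
    """Parse one PSYoungGen entry; sizes are in KB, the pause in seconds."""
    match = GC_ENTRY.search(text)
    if match is None:
        return None
    fields = {
        key: int(value)
        for key, value in match.groupdict().items()
        if key != "pause_secs"
    }
    fields["pause_secs"] = float(match.group("pause_secs"))
    # Memory freed in the young generation and in the heap as a whole.
    fields["young_freed"] = fields["young_before"] - fields["young_after"]
    fields["heap_freed"] = fields["heap_before"] - fields["heap_after"]
    # What left the young generation without leaving the heap was,
    # roughly, promoted to the old generation.
    fields["promoted"] = fields["young_freed"] - fields["heap_freed"]
    return fields


if __name__ == "__main__":
    entry = (
        " [PSYoungGen: 129311K->3232K(136512K)] "
        "558882K->434085K(585920K), 0.0090900 secs]"
    )
    print(parse_gc_entry(entry))
```

For the entry quoted above, this yields about 126 MB freed in the young generation during a pause of roughly 9 ms.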




Re: performance tuning - where does the slowness come from?
Weijunli,

I also have an environment with similarly large datasets and strict
latency requirements. Can you please elaborate on the custom changes you
made to Cassandra to meet these SLAs, whether code or configuration? I am
very interested in learning more about the internal workings of Cassandra
and its performance.

Thanx,
Artie

On Thu, May 6, 2010 at 10:06 AM, Weijun Li <weiju### @gmail.com> wrote:

 Our use case is a little different: our server is a typical high-volume
 transaction server that processes more than half a billion requests per
 day. The write/read ratio is close to 1, and the cluster needs to serve
 10k write+read with strict latency (<20ms); otherwise the client will
 treat it as a failure. Plus we have hundreds of millions of keys, so the
 generated sstable files are much bigger than the RAM size. In this case
 using mmap will cause Cassandra to sometimes use > 100G of virtual
 memory, which is much more than the physical RAM; since we are using the
 random partitioner, the OS will be busy swapping.

 I have finally customized Cassandra to meet the above requirement using
 cheap hardware (32G RAM + SATA drives). One thing I learned is that you
 have to carefully avoid swapping, especially when you need to cache most
 of the keys in memory: swap can easily damage the performance of your
 in-memory cache. I also made some optimizations to reduce memory/disk
 consumption and to make it easier for us to diagnose issues. In a word:
 Cassandra is very well written, but there's still huge potential for you
 to improve it to meet your special requirements.

 -Weijun


 On Wed, May 5, 2010 at 9:43 AM, Jordan Pittier <jordan.p### @gmail.com> wrote:

> I disagree. Swapping could be avoided. I don't know Cassandra's internal
> mechanisms, but what I am expecting is that whenever I want to read rows
> that are not in RAM, Cassandra loads them from the hard drive into RAM if
> space is available and, if RAM is full, replies to my query without
> saving the rows in RAM. No need for swapping.
>
> I have not yet tried changing DiskAccessMode to standard; I hope it will
> help me.
>
> Another thing: please don't post your benchmark figures without any
> explanation of the workload generator or your cluster settings. It
> really doesn't make any sense...
>
>
> On Wed, May 5, 2010 at 6:16 PM, Weijun Li <weij### @gmail.com> wrote:
>
>> When you have much more data than you can hold in memory, it will be
>> difficult for you to get around swap, which will most likely ruin your
>> performance. Also in this case mmap doesn't seem to make much sense if
>> you use the random partitioner, which will end up with crazy swap too.
>> However, we found a way to get around the read/write performance issue
>> by integrating memcached into Cassandra: in this case you need to ask
>> memcached to disable disk swap so you can achieve more than 10k
>> read+write with millisecond-level latency. Actually this is the only
>> way we figured out that can gracefully solve the performance and
>> memory issue.
>>
>> -Weijun
>>
>>
>> On Wed, May 5, 2010 at 8:19 AM, Ran Tavory <rant### @gmail.com> wrote:
>>
>>> I'm still trying to figure out where my slowness is coming from...
>>> By now I'm pretty sure it's the reads that are slow, but I'm not sure
>>> how to improve them.
>>>
>>> I'm looking at cfstats. Can you say if there are better configuration
>>> options? So far I've used all default settings, except for:
>>>
>>>     <Keyspace Name="outbrain_kvdb">
>>>       <ColumnFamily CompareWith="BytesType" Name="KvImpressions" KeysCached="50%"/>
>>>       <ReplicaPlacementStrategy>org.apache.cassandra.locator.RackAwareStrategy</ReplicaPlacementStrategy>
>>>       <ReplicationFactor>2</ReplicationFactor>
>>>       <EndPointSnitch>org.apache.cassandra.locator.EndPointSnitch</EndPointSnitch>
>>>     </Keyspace>
>>>
>>>
>>> What does a good read latency look like? I was expecting 10ms; however,
>>> so far it seems that my KvImpressions read latency is 30ms and in the
>>> system keyspace I have 800ms :(
>>> I thought adding KeysCached="50%" would improve my situation, but
>>> unfortunately it looks like the hit rate is about 0. I realize that's
>>> application specific, but maybe there are other magic bullets...
>>>
>>> Is there something like adding cache to the system keyspace? 800 ms is
>>> pretty bad, isn't it?
>>>
>>> See stats below and thanks.
>>>
>>>
>>> Keyspace: outbrain_kvdb
>>>         Read Count: 651668
>>>         Read Latency: 34.18622328547666 ms.
>>>         Write Count: 655542
>>>         Write Latency: 0.041145092152752985 ms.
>>>         Pending Tasks: 0
>>>                 Column Family: KvImpressions
>>>                 SSTable count: 13
>>>                 Space used (live): 23304548897
>>>                 Space used (total): 23304548897
>>>                 Memtable Columns Count: 895
>>>                 Memtable Data Size: 2108990
>>>                 Memtable Switch Count: 8
>>>                 Read Count: 468083
>>>                 Read Latency: 151.603 ms.
>>>                 Write Count: 552566
>>>                 Write Latency: 0.023 ms.
>>>                 Pending Tasks: 0
>>>                 Key cache capacity: 17398656
>>>                 Key cache size: 567967
>>>                 Key cache hit rate: 0.0
>>>                 Row cache: disabled
>>>                 Compacted row minimum size: 269
>>>                 Compacted row maximum size: 54501
>>>                 Compacted row mean size: 933
>>> ...
>>> 
Scanning through 12k rows, performance tuning suggestions?
Hi list,

I just uploaded a screen shot of some profiling result to flickr:
http://www.flickr.com/photos/alexdong/4884059360/

This query will scan around 12k rows; my C++ code will read one column
and parse the row key into 3 parts. I've already found that the row
key parsing code can be made at least 6 times faster by redesigning
the row key and using a more efficient parsing method. I hope that will
bring this query down from 15 seconds to 2-3 seconds, but that's still
a fair distance from our sub-second goal.

Looking at the profiling info, I'm surprised to see that the disk read
speed peaks at around 9M and is not continuous. I was expecting to see
something more like a continuous read at 30-50M per second.
I'm guessing that since each CellStore is only 65k, disk fragmentation
might cause some data discontinuity here?

Anything I can do to lift the speed here? Or maybe I'm not using
Hypertable the way I should?

Thanks,
Alex





APR or Apache, virtual hosts and multi-core tuning...
Hi All,

I noticed that the YSlow plugin for Firefox recommends using only 4
virtual hosts instead of the 16 I have for serving images. I am using a
6-core system and want to make sure I take advantage of it with Apache
APR, so I set up more than 4 virtual hosts.

I am using multiple virtual hosts for serving images since I heard that
this can improve response time for users whose browsers support 4 or more
concurrent connections. I have a screenshots page where 24 thumbnails are
loaded into the browser, and I am trying to tune APR for the best user
response time.

So, is there any best practice for this that you would recommend?

Thanks,
-Tony


      


Correct listing of resource children (follow up to SLING-1672)
Hi

When getting a resource the case seems to be clear:
The first resource provider which returns a resource
*wins*. And the resource providers are called in order
starting with the provider which is registered for the
longest part of the requested path.
With ResourceResolver#listChildren it's a bit trickier.
Assume the following:

structure in the JCR:

/foo
/foo/bar
/foo/bar/test

and in another resource provider:

/foo/bar
/foo/bar/myresource

case 1)
ResourceResolver#listChildren( "/foo/bar" ) should now
list the following

test
myresource

case 2)
Assume another provider:
/some/path/resource
/some/path/resource2

What should ResourceResolver#listChildren( "/" ) list?
From my understanding it should list:

foo
some

where some may be a SyntheticResource.

case 1) and case 2) are not returning the expected result, at least
not if you use a bundle resource provider. I haven't looked into the
details so I can't say if it's a problem of the bundle resource
provider or a more general problem with the resource resolver
implementation.

Maybe security could be a problem. But a resource provider at least
can access the user id via ResourceResolver#getUserID, and list children
only if access is allowed. I don't know if this behaviour of a
resource provider is intended.

WDYT?

best regards
mike
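
The expected results for case 1) and case 2) can be written down as a small merge over all providers. The following is a language-neutral sketch in Python, not the Sling API: the function name `list_children`, the provider names, and the representation of a provider as a set of paths are assumptions made for illustration only. A child is flagged as synthetic when no provider serves that exact path and it exists only as an ancestor of deeper resources.

```python
def list_children(parent, providers):
    """Merge the child names of `parent` across all providers.

    `providers` maps a (hypothetical) provider name to the set of
    resource paths it serves. Returns {child_name: is_synthetic}.
    """
    prefix = parent.rstrip("/") + "/"
    children = {}
    for paths in providers.values():
        for path in paths:
            if not path.startswith(prefix):
                continue
            name, _, deeper = path[len(prefix):].partition("/")
            if not name:
                continue
            is_real = deeper == ""
            # A real resource from any provider overrides "synthetic".
            children[name] = children.get(name, True) and not is_real
    return children


if __name__ == "__main__":
    providers = {
        "jcr": {"/foo", "/foo/bar", "/foo/bar/test"},
        "other": {"/foo/bar", "/foo/bar/myresource"},
        "third": {"/some/path/resource", "/some/path/resource2"},
    }
    print(list_children("/foo/bar", providers))  # case 1
    print(list_children("/", providers))         # case 2
```

With the paths from the post, case 1) yields test and myresource, and case 2) yields foo and some, with some flagged as synthetic.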


Created: (COCOON-2297) Character encoding does not follow JTidy properties
Character encoding does not follow JTidy properties

Created: (CXF-2917) Have package naming more closely follow Appendix D.5.1 of JAXB 2.0/2.2 specifica
Have package naming more closely follow Appendix D.5.1 of JAXB 2.0/2.2
specification

Created: (AVRO-559) Handle read_union error where the list index of the union branch to follow excee
Handle read_union error where the list index of the union branch to follow
exceeds the size of the union schema

Created: (FELIX-2510) Configuration not provided to components if Configuration Admin is not active
Configuration not provided to components if Configuration Admin is not
active while setting up components

Created: (CONFIGURATION-422) Configuration.getInt() should define its accepted inputs, and should al
Configuration.getInt() should define its accepted inputs, and should also
accept octal

Created: (CONFIGURATION-421) VFSFileChangedReloadingStrategy.init() uses configuration before checki
VFSFileChangedReloadingStrategy.init() uses configuration before checking
it for null

Created: (FOR-1197) Follow the ASF "Apache Project Branding Guidelines"
Follow the ASF "Apache Project Branding Guidelines"