Emulating a LinkedHashMap-like structure in Voldemort

Aug 23, 2010
Alex Kravets
We have a specific use case: millions of log entries, each of which is
a small JSON object. They occur in chronological order, are
timestamped, and are also keyed by GUIDs.

Question: Given our desire to iterate through entries in the same
chronological order, what is the recommended strategy to store them in
Voldemort stores ?

Thanks in advance for any suggestions.
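One possible way to get LinkedHashMap-like ordering out of a plain key-value store is to keep a "next" pointer inside each entry, plus two reserved keys for the head and the tail. The sketch below is only an illustration in Python: a plain dict stands in for the Voldemort store, and the reserved key names, the field names, and both helper functions are hypothetical. None of this is the Voldemort client API or a recommended strategy from the project.

```python
# Hypothetical sketch: a plain dict stands in for a key-value store.
# None of these names come from the Voldemort client API.

HEAD_KEY = "__head__"  # assumed reserved key: GUID of the oldest entry
TAIL_KEY = "__tail__"  # assumed reserved key: GUID of the newest entry


def append_entry(store, guid, timestamp, payload):
    """Store an entry under its GUID and link it after the current tail."""
    store[guid] = {"ts": timestamp, "payload": payload, "next": None}
    tail = store.get(TAIL_KEY)
    if tail is None:
        store[HEAD_KEY] = guid        # first entry becomes the head
    else:
        previous = dict(store[tail])  # copy, update the pointer, write back
        previous["next"] = guid
        store[tail] = previous
    store[TAIL_KEY] = guid


def iterate_entries(store):
    """Yield (guid, entry) pairs in insertion (chronological) order."""
    guid = store.get(HEAD_KEY)
    while guid is not None:
        entry = store[guid]
        yield guid, entry
        guid = entry["next"]


store = {}
append_entry(store, "guid-a", 1282521600, {"msg": "first"})
append_entry(store, "guid-b", 1282521601, {"msg": "second"})
append_entry(store, "guid-c", 1282521602, {"msg": "third"})
ordered_guids = [guid for guid, _ in iterate_entries(store)]
```

Lookups by GUID stay a single get, while iteration costs one get per entry. In a real distributed store, concurrent appenders would need some form of coordination when updating the tail pointer, which this sketch does not attempt.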





Similar Threads
list of structure..
Hello everybody!

We have a puppet infrastructure integrated with ldap. I can pass
different variables via the puppetVars attribute in ldap.
Currently, I need to write a puppet class to configure a mod_proxy server.
Each mod_proxy server has 5-10 configured sites. Every site configuration
has some variables: site name, listen port, destination site name,
destination port, protocol...
Does anybody have an idea how I can keep all these variables outside the
puppet classes? In ldap? Or maybe puppet has some internal database,
like chef?





Erlang Doc structure
I've been playing with Erlang views on and off in CouchDB, and find the
information on the internal Erlang structure of Docs lacking. Is there
somewhere I can read up on this? I had initially assumed that a Doc was a
list of tuples, but I am starting to suspect that I am wrong.

Mike

arbitrary structure for joins redux
Hi,
I recently discovered the new feature added in couchdb 0.11 which lets you
link related documents in the emit function by passing the _id value of a
given doc, as explained in
http://blog.couch.io/post/446015664/w...11-part-two-views
I find this feature quite useful, but I would like to know if it's planned to
let us use an arbitrary structure as the value for the emit function, which
could contain several _id references, like
emit(key, {title:"lorem ipsum", foo:{_id : "foo"}, bar : {_id : "bar"}})
so that more than one document could be linked with a single emit call. A
possible solution could be to call the emit function several times,
but then the include_docs parameter could not be used when a reduce function is
defined.

Thanks in advance

Regards




JMS-MAP-JSON Improper Data structure
I am getting an odd error in my data from my broker. Using perl I am changing
a map message to json using jms-map-json. This returns an improperly
structured version of my data.



'map' => [
                     {
                       'entry' => [
                                    {
                                      'string' => [

As you can see, it is creating an array where the hash should be.

I now have to access data like @{@{$json->{map}}[0]->{entry}}; which is very
unpleasant. We did not have this issue before. I used to be able to simply do

@{$json->{map}{entry}} to access my data
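As a stopgap on the consumer side, the extra array level could be collapsed after decoding. The following is a rough sketch of that idea in Python rather than Perl. The sample message is hypothetical (the dump above is truncated, so its real contents are unknown), and unwrap_singletons is an illustrative helper, not part of jms-map-json or ActiveMQ.

```python
def unwrap_singletons(node):
    """Recursively replace every one-element list with its sole element."""
    if isinstance(node, list):
        items = [unwrap_singletons(item) for item in node]
        return items[0] if len(items) == 1 else items
    if isinstance(node, dict):
        return {key: unwrap_singletons(value) for key, value in node.items()}
    return node


# Hypothetical decoded message, shaped like the truncated dump above.
decoded = {
    "map": [
        {
            "entry": [
                {"string": ["host", "db1"]},
                {"string": ["port", "5432"]},
            ]
        }
    ]
}

flat = unwrap_singletons(decoded)
entries = flat["map"]["entry"]  # no [0] index needed any more
```

One caveat with this approach: a map holding exactly one entry would also be unwrapped, so the caller would still have to handle "entry" being either a list or a single hash.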


This is with amq 5.3.2




Need advice about docs structure design.
Hi guys.

My model:
Users have
  Bookmarks and Posts

I have an idea to keep Bookmarks and Posts in separate DBs (for view speed).
But in a view I want to fetch Bookmarks and Posts together with the user name.

How can I do this?
Keep a copy of the users in each DB?

Or maybe keep users in a separate DB, fetch bookmarks or posts, then
fetch users, and then merge?

Yes, I know I can copy the user name into each doc, but then I have trouble when a user
name changes (it forces me to use stale=ok in the view request
plus a background process to track changes and rebuild the index).

I think it would be crazy cool if I could update docs (cached
fields that have no effect on the index) and tell Couch it doesn't need
to rebuild the index, relax ;).

Any advice?


RE: Delete document Tree Structure

Hi,

I am very new to CouchDB. In my design, I just keep the parent id
on each document.
So if I would like to delete B, I know I also have to delete C (which has B's id),
and then find and
delete E (which has C's id), etc.
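The parent-id walk described here can be sketched outside the database. The example below is an illustration only: it runs over an in-memory list of documents, the field name parent_id is an assumption, and the tree shape is a guess based on the diagram quoted later in this thread. It makes no CouchDB calls.

```python
from collections import defaultdict, deque


def collect_subtree(docs, root_id):
    """Return root_id followed by the ids of all descendants, parents first."""
    children = defaultdict(list)
    for doc in docs:
        children[doc.get("parent_id")].append(doc["_id"])
    order, queue = [], deque([root_id])
    while queue:
        current = queue.popleft()
        order.append(current)
        queue.extend(children[current])
    return order


# Assumed tree: A has children D and B; B -> C -> E -> F.
docs = [
    {"_id": "A", "parent_id": "directory"},
    {"_id": "D", "parent_id": "A"},
    {"_id": "B", "parent_id": "A"},
    {"_id": "C", "parent_id": "B"},
    {"_id": "E", "parent_id": "C"},
    {"_id": "F", "parent_id": "E"},
]

to_delete = collect_subtree(docs, "B")
deletion_order = list(reversed(to_delete))  # leaves first, B last
```

Deleting in leaf-first order means an interrupted run never leaves a child whose parent is already gone.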

My questions are:

1. Is this possible?

2. How do I pass a document id (such as B) to a view so that I can
search for and delete the documents under it?

3. Does CouchDB support deleting documents in a view, or do I have to query
all of them and then use the HTTP API to delete them?

         

Thanks for any ideas.

A.







-------- Original Message --------
Subject: Re: Delete document Tree Structure
Date: Thu, 27 May 2010 06:43:58 +0200
From: J Chris Anderson <jch### @gmail.com>
Reply-To: use### @couchdb.apache.org <us### @couchdb.apache.org>
To: use### @couchdb.apache.org <us### @couchdb.apache.org>





On May 26, 2010, at 8:48 PM, Aun... ??????? wrote:

 
 
 Hi,
 
       I designed my documents in CouchDB to have a relation something like
 
              directory
                    |
                    A---------D
                    |
                    B---------C
                                 |
                                 |
                                 E ------F
 
         If I issue an HTTP DELETE on B, I would like all the documents
 below it to be deleted as well. What are the possible solutions?
 
                1.    Can I create a view, find the relations, and delete all of them?
                2.    Get the relations through a higher-level programming API (C#, PHP) and then delete each one?
                3.    Do you have any other solution to suggest?
 

It is common to store the full path to each item, on the item, so you'd have

B > C > E > F stored on F

then you can view easily across the tree.

However, reparenting a node (say, moving B to become a child of D)
requires asynchronous processing or a bulk docs request and is not
transactional.
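To make the stored-path idea concrete, here is a small in-memory sketch in Python. The field name path and both helper functions are assumptions chosen for illustration. Nothing here calls CouchDB: a real setup would emit the path from a view and send the rewritten docs in one bulk request.

```python
def descendants_by_path(docs, ancestor_id):
    """Return ids of docs whose stored path passes through ancestor_id."""
    return [doc["_id"] for doc in docs if ancestor_id in doc["path"][:-1]]


def reparent(docs, node_id, new_parent_path):
    """Rewrite the path of node_id and of every descendant; return changed ids."""
    changed = []
    for doc in docs:
        if node_id in doc["path"]:
            tail = doc["path"][doc["path"].index(node_id):]
            doc["path"] = new_parent_path + tail
            changed.append(doc["_id"])
    return changed


# Each document carries its full path, as in "B > C > E > F stored on F".
docs = [
    {"_id": "B", "path": ["B"]},
    {"_id": "C", "path": ["B", "C"]},
    {"_id": "E", "path": ["B", "C", "E"]},
    {"_id": "F", "path": ["B", "C", "E", "F"]},
    {"_id": "D", "path": ["D"]},
]

under_b = descendants_by_path(docs, "B")
changed = reparent(docs, "B", ["D"])  # move B under D
```

Finding everything under B needs no recursion, but moving B touches B and every descendant. That is why reparenting ends up as a multi-document update rather than a single write.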

Chris


 
 Thanks,
 A. 
 
 
 		 	   		  
 
adding request structure access to mod_header
We were dancing around trying to get RequestHeader (mod_header) to add
the authenticated user to a proxy request. The approaches we found
involved tricking other modules into calling ap_add_common_vars to
  apr_table_addn(r->subprocess_env, "REMOTE_USER", r->user);

I decided to simplify by adding %{}r directives à la
  RequestHeader set "x-webobjects-remote-user" "%{user}r"
which interrogate the request structure directly. I re-used
the request structure names 'cause, well, why not?


Please try to reply to <er### @w3.org>. <eri### @gmail.com> is a
temporary hack as ezmlm seems to subscribe the address in the
Return-Path: instead of the From: (grr!).



Message data structure merge heads up
  Hi guys,

just to give you some info about the ongoing merge (I'm removing the 
MessageCodec hierarchy, replacing it with the InternalMessage hierarchy, 
in the server and on the client api).

So far, I have successfully been able to use the InternalMessage classes 
to decode the AbandonRequest, BindRequest, UnbindRequest, DelRequest. 
However, the associated Codec messages are still present, as the API is 
using them.

OTOH, and because it's easier, I have completely removed the AddResponse, 
BindResponse and DelResponse codec classes. They have been replaced by 
their equivalent classes from InternalMessage. This has been done in the 
client API and in the server.
It's going fast, as all those data structures are similar.

There are two things to be done once all those response messages 
have been merged:
- remove the API Response message classes, to use the InternalMessage 
response classes
- Fix the dsml-parser which is using the codec hierarchy

A third step would be to get rid of all the Request codec classes, this 
will come next.

I think that I have 4 more days to get all this done. It's not exactly 
fun, but at least we will have a cleaner implementation...





Re: Message data structure merge heads up
  On 8/12/10 4:24 PM, feez### @gdls.com wrote:
 Gentlemen,
Hi,
 My home system is Fedora Core 13 Linux with the Sun JDK 1.6_20,
 Subversion, Maven 2.2.1, and Eclipse 3.6 installed.
Perfect.
 Back at a command prompt I tried running the following to make sure
 everything was working.

 4)      "mvn test"

 Only one test "testSaslGssapiBind" is failing, and, looking at the test
 code, it appears that the author doesn't expect this one to work yet.
Strange. All the tests are passing on our linux machines.

Have you tried mvn clean install -Dintegration ?

 At my day job I have a Windows XP SP-3 with Sun JDK 1.6_17, Maven 2.2.1,
 Subversion, and Eclipse 3.6 installed.

 Access to the Internet from this system is restricted to using an HTTP
 proxy that requires NTLM authentication.
Thanks a lot, M$ ...

<snip/>
 Again the same three steps mentioned above were completed successfully
 (despite the erratic operation of the proxy server).

 When I tried "mvn test", however, I got several failures.

 One of the failures, "testSaslGssapiBind", is the same as on Linux. For
 now I'm assuming this is a known problem that is being worked on.

 Using Eclipse to investigate each of the others I've discovered that two
 of them are related to Windows's use of the "\" character as the path
 separator and one is related to incorrect handling of "escaping" of
 characters in filenames.  I'm developing fixes for these issues now and
 will post suggested patches after I complete testing.
Great! We don't use W$ at all, so it's likely we have some test 
failures if we are not cautious enough. That's the price to pay for being 
efficient...

 The only remaining test failure is "testSearchUTF8" in
 "ClientSearchRequestTest", which is not throwing the expected Exception.  I
 haven't investigated this one yet but plan to when time is available.

Hmmmm... I can't find this class. In which module did you find it?










Is it possible to use instanceof operator in control structure tags (if, choose, etc.)?
I've searched for instanceof in the documentation but I can't see any
instance of it--pun unintended. :) Anyway, if it's possible, how do I
use it inside a Mapper config? Can I declare a typeAlias in the MyBatis
config and use that alias? Or do I need a fully qualified class name?

I'm going to use it for inserting objects that belong to a type
hierarchy for which I only have one table. I'm using a type column
to determine the actual classes of the records.

Thanks.


Where is ProducerInfo+ConsumerInfo data structure for Advisory Msgs?
On page

http://activemq.apache.org/advisory-message.html

data structures named ProducerInfo and ConsumerInfo are mentioned.
Where can I find these data structures explained (in particular, which fields they
contain in detail)?

Furthermore, I wonder how I can detect whether an incoming msg on an Advisory
topic is a ProducerInfo or a ConsumerInfo type msg. On the web page above, a
Java sample is shown where an incoming msg is always cast to ProducerInfo:

ProducerInfo prod = (ProducerInfo) aMsg.getDataStructure();

Why ProducerInfo and not ConsumerInfo?

I expected some sort of investigation first:

DataStructure ds = aMsg.getDataStructure();
if (ds instanceof ProducerInfo) {
   ProducerInfo prod = (ProducerInfo) ds; }
else if (ds instanceof ConsumerInfo) {
   ConsumerInfo cons = (ConsumerInfo) ds; }
....

So what should this look like officially? Is there a more elaborate Java
example?

Ben









Re: Load multiple files from a date range (available in the directory structure)
Hi,

Have a look at the new HiveColumnarLoader (pig 0.8.0) implementation in
piggy bank. It uses two classes which I kept separate from the loader
itself to support partition loading. The only thing is that the partitions need
to have the folder name format key=value

E.g

Date partitions:
/logs/date=2010-08-14
/logs/date=2010-08-15



----- Original Message -----
From: Arun A K <arnkr### @gmail.com>
To: pig-us### @hadoop.apache.org <pig-use### @hadoop.apache.org>
Sent: Wed Aug 18 19:35:46 2010
Subject: Load multiple files from a date range (available in the directory structure)

Hi

I have the following scenario-

Pig version used 0.70

Sample HDFS directory structure:
/user/training/test/20100810/<data files>
/user/training/test/20100811/<data files>
/user/training/test/20100812/<data files>
/user/training/test/20100813/<data files>
/user/training/test/20100814/<data files>

As you can see in the paths listed above, one of the directory names is a
date stamp.

Problem: I want to load files from a date range say from 20100810 to
20100813.

I can pass the 'from' and 'to' of the date range as parameters to the Pig
script, but how do I make use of these parameters in the LOAD statement? I am
able to do the following:
temp = LOAD '/user/training/test/{20100810,20100811,20100812}' USING
SomeLoader() AS (...);

But how do I make use of the parameters to the Pig script? Do I need to make
use of a higher-level language like Python to capture all date stamps in the range
and pass them to LOAD as a comma-separated list?
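For illustration, the wrapper approach mentioned above could look roughly like the following Python sketch. It expands a from/to pair into the same brace-list form used in the LOAD statement. The function name date_glob and its base argument are hypothetical; the default base path is the one from the sample directory structure.

```python
from datetime import date, timedelta


def date_glob(start, end, base="/user/training/test"):
    """Build a {d1,d2,...} glob covering start..end inclusive (YYYYMMDD)."""
    if end < start:
        raise ValueError("end date precedes start date")
    day_count = (end - start).days + 1
    stamps = [
        (start + timedelta(days=offset)).strftime("%Y%m%d")
        for offset in range(day_count)
    ]
    return "%s/{%s}" % (base, ",".join(stamps))


input_glob = date_glob(date(2010, 8, 10), date(2010, 8, 13))
# input_glob == "/user/training/test/{20100810,20100811,20100812,20100813}"
```

The resulting string could then be handed to the script through Pig's parameter substitution, for example as -param input=... on the command line with LOAD '$input' in the script. The braces and commas will probably need quoting in the shell.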

Thanks for your time.

cheers
Arun A K
Graduate Student
Department of Computer Science
Indiana University, Bloomington

