simulating remove + limit

Aug 11, 2010
Dima Brodsky
Hi,

I am using MongoDB as a queue and it is working quite well. I am
using findAndModify with the remove flag set to true to return a
single element and immediately remove it from the collection. What
I'd like to do is fetch and delete several items at once,
or at least minimize the number of calls made to the mongo server.
What would be the best way to simulate a 'remove' with a limit?

Before MongoDB I used MySQL and I would do something like:

update jobQ set worker = %s order by insertTime limit 20
select * from jobQ where worker = %s
delete from jobQ where worker = %s

Once a set of jobs was assigned to a particular worker, there was no
way for another worker to pick them up. I'd like to reproduce the
above three statements in MongoDB.
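The three SQL statements above amount to a claim, fetch, delete sequence. A minimal sketch of that sequence follows, using Python's built-in sqlite3 module in place of MySQL so that it runs without a server; it is not MongoDB code. The `id` and `payload` columns and the `claim_jobs` helper are hypothetical, and the `WHERE worker IS NULL` filter is an added assumption so that jobs already claimed by another worker are not reassigned. Many SQLite builds do not accept `UPDATE ... ORDER BY ... LIMIT`, so the limit is applied through a subquery instead.

```python
import sqlite3


def claim_jobs(conn, worker, batch_size):
    """Claim up to batch_size unassigned jobs for worker, return them, then delete them."""
    with conn:  # one transaction: commit on success, roll back on error
        # Step 1: tag the oldest unclaimed jobs with this worker's name.
        conn.execute(
            "UPDATE jobQ SET worker = ? WHERE id IN ("
            " SELECT id FROM jobQ WHERE worker IS NULL"
            " ORDER BY insertTime LIMIT ?)",
            (worker, batch_size),
        )
        # Step 2: read back everything tagged for this worker.
        jobs = conn.execute(
            "SELECT id, payload FROM jobQ WHERE worker = ? ORDER BY insertTime",
            (worker,),
        ).fetchall()
        # Step 3: remove the claimed jobs from the queue.
        conn.execute("DELETE FROM jobQ WHERE worker = ?", (worker,))
    return jobs


# Hypothetical in-memory queue with 50 jobs.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jobQ (id INTEGER PRIMARY KEY, payload TEXT,"
    " insertTime INTEGER, worker TEXT)"
)
conn.executemany(
    "INSERT INTO jobQ (payload, insertTime) VALUES (?, ?)",
    [("job-%d" % i, i) for i in range(50)],
)
conn.commit()

first = claim_jobs(conn, "worker-a", 20)
second = claim_jobs(conn, "worker-b", 20)
```

Each call costs three statements regardless of batch size, which is the property the question is after: the number of round trips no longer grows with the number of jobs fetched.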

Thanks!
ttyl
Dima



"The price of reliability is the pursuit of the utmost simplicity.
It is a price which the very rich find the most hard to pay."
                                                                      
(Sir Antony Hoare, 1980)





Similar Threads
Simulating SQL 'LIKE %' using Regular Expressions
Hi, in my CouchDB database, when I query a view in the design doc with
?key="C" at the end, I want it to bring up all shop names that start
with C.
To do this I have written the following map function:

function(doc)
{
  for (var id in doc.Stores)
  {
    var strqry = doc.Stores[id]["name"].match(/^.*/);

    if (strqry)
    {
      emit(strqry, doc.product_name + " qty: " + doc.Stores[id].item_count);
    }
  }
}

When I run this from the temp view in Futon, I get a list of all stores
and products (which made me think 'yay it worked, all it needs is a
parameter'), but when I use
http://host:5984/db/design/name/_view/function?key="C" I get back:
{"total_rows":30,"offset":0,"rows":[]}

My ultimate aim is to get it working similar to an SQL LIKE %, so that if,
for example, I say ?key="C" it will return "Computer Store A" and so on.
I based my function on this tutorial:
http://books.couchdb.org/relax/refere...s-for-sql-jockeys
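A view lookup with ?key= matches whole keys exactly, and the map function above emits the result of match(), which is an array rather than a plain string, so ?key="C" finds nothing. The usual CouchDB idiom for a prefix match is a key range, ?startkey="C"&endkey="C\ufff0", over a view that emits the store name itself as the key. The Python sketch below only simulates that range lookup over a sorted list; the rows are made up, and Python compares strings by code point whereas CouchDB uses its own collation, so treat it as an illustration of the idea rather than of CouchDB's exact ordering.

```python
from bisect import bisect_left, bisect_right

# Hypothetical view rows, sorted by key as a view index would be.
rows = sorted([
    ("Bakery B", "bread qty: 4"),
    ("Camera Shop", "lens qty: 2"),
    ("Computer Store A", "mouse qty: 7"),
    ("Deli D", "cheese qty: 1"),
])
keys = [key for key, _ in rows]


def exact(key):
    """Rows whose key equals `key` exactly, like ?key=..."""
    return rows[bisect_left(keys, key):bisect_right(keys, key)]


def prefix(p, high="\ufff0"):
    """Rows whose key starts with `p`, like ?startkey=p&endkey=p+high."""
    return rows[bisect_left(keys, p):bisect_right(keys, p + high)]


exact_hits = exact("C")    # no store is named exactly "C"
prefix_hits = prefix("C")  # every store whose name starts with "C"
```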




Simulating an auto-incrementing column
Hi,

I have a problem and I hope someone has an idea on how to solve it.

My dataset consists of just very simple key-value pairs of strings
coming from PostgreSQL using Sqoop.

1) I need to count how often a key occurs -> Easy
2) I need to count how often a key-value pair occurs -> Easy

I need to output this data to PostgreSQL again, into two tables:

a) "keys" with the columns: id, key_name, count
b) "values" with the columns: id, key_id, value_name, count

Now the ids I'm referring to don't exist yet and I'm looking into
solutions to generate them. They have to be integers/longs but they
don't have to be in any order/pattern. I'm not concerned about
performance either as this query will be run monthly at most.

Do you have any idea how I could introduce this new column into the
output of query 1)? I could easily introduce it into 2) with a join
then. I thought about using a custom reducer script but apart from the
fact that I've never done it so far it would require that there is
only one reducer so that I can simulate an auto-incrementer. My
current best idea is to write a regular MR job that processes the Hive
output but I'd love to do everything in Hive if possible.

I might very well approach this problem completely wrong so don't
hesitate to propose a better solution or bash me for my poor
understanding of Hive :)

Thanks for any input and help.

Cheers,
Lars
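To make the target tables concrete, here is a small sketch in plain Python (not Hive) of the two outputs described above. The sample pairs are invented, and ids are handed out by enumerating the distinct keys in a single pass, which is the same idea as the single-reducer approach mentioned in the post. Sorting is only there to make the result reproducible; the post says the ids need no particular order.

```python
from collections import Counter

# Hypothetical key-value pairs standing in for the Sqoop import.
pairs = [("color", "red"), ("color", "blue"), ("color", "red"), ("size", "L")]

key_counts = Counter(k for k, _ in pairs)   # 1) how often each key occurs
pair_counts = Counter(pairs)                # 2) how often each key-value pair occurs

# Assign an integer id to each distinct key in one pass.
key_ids = {k: i for i, k in enumerate(sorted(key_counts), start=1)}

# Table "keys": id, key_name, count
keys_table = [(key_ids[k], k, key_counts[k]) for k in sorted(key_counts)]

# Table "values": id, key_id, value_name, count
values_table = [
    (i, key_ids[k], v, n)
    for i, ((k, v), n) in enumerate(sorted(pair_counts.items()), start=1)
]
```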


Columns limit
Are there any limitations on the number of columns a row can have? Does
all the data for a single key need to reside on a single host? If so,
wouldn't that mean there is an implicit limit on the number of columns
one can have, i.e. the disk size of that machine?

What is the proper way to handle timelines in this case? For example,
let's say I wanted to store all user searches in a super column.

<ColumnFamily Name="SearchLogs"
                     ColumnType="Super"
                     CompareWith="TimeUUIDType"
                     CompareSubcolumnsWith="BytesType"/>

Which results in a structure as follows:
{
    SearchLogs : {
        "foo" : {
            timeuuid_1 : { metadata goes here }
            timeuuid_2 : { metadata goes here }
        },
        "bar" : {
            timeuuid_1 : { metadata goes here }
            timeuuid_2 : { metadata goes here }
        }
    }
}

Couldn't this theoretically run out of columns for the same search term 
because for each unique term there can (and will) be many timeuuid
columns?

Thanks for clearing this up for me.



Connections limit
Hello Everyone.

I need to subscribe about 4,000 clients to a server at the same time.

I'm using STOMP for now.

Does anyone know if it will be possible using ActiveMQ?

BTW, does anyone know a way to test it?

Thank you for now.




CLI limit/count
Is there any way to limit the number of results returned from the CLI?


Limit connections
Hello List,

I have been doing plenty of googling and reading, but I must have been
reading in all the wrong places because I cannot find the
information: how do I configure Apache so that it limits the number of
connections from a given host or hosts?

Regards
Per Qvindesland


Re: LIMIT Issue
Matt,

Which version are you on? What happens if you run your query through
grunt instead of PigServer?
I tried a load-order-limit sequence on a small dataset in grunt and I
got the expected results.

Ashutosh
On Wed, Aug 4, 2010 at 15:07, Matthew Smith
<Matthew### @g2-inc.com> wrote:
 Hey,

 While running in Java a LIMIT statement is not getting executed.

 /code

 myServer.registerQuery("flow_firstcut = FOREACH data GENERATE sIP, dIP, sPort, dPort, protocol, bytes, flags;");
 myServer.registerQuery("filtered = FILTER flow_firstcut BY sIP matches 'someIP';");
 myServer.registerQuery("O = ORDER filtered BY bytes DESC;");
 myServer.registerQuery("topTen = LIMIT O 10;");
 myServer.store("topTen", outputFilePath);

 /code

 This produces a 699 line file. It should produce a 10 line file.

 /code

 myServer.registerQuery("flow_firstcut = FOREACH data GENERATE sIP, dIP, sPort, dPort, protocol, bytes, flags;");
 myServer.registerQuery("filtered = FILTER flow_firstcut BY sIP matches '"+parameters[1]+"';");
 //myServer.registerQuery("O = ORDER filtered BY bytes DESC;");
 myServer.registerQuery("topTen = LIMIT filtered 10;");
 myServer.store("topTen", outputFilePath);

 /code

 This produces a 10 line file.

 Is there a known bug I am unaware of or can you not order then limit?

 http://hadoop.apache.org/pig/docs/r0....n_ref2.html#LIMIT

 indicates that this is a valid sequence of calls.

 Help?

 Matt
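For reference, the result the first snippet is expected to give (order by bytes descending, then keep ten rows) can be written down in a few lines of Python. This says nothing about how Pig executes the plan; the flow records are invented and only the `bytes` field matters here.

```python
import heapq

# Hypothetical flow records as (sIP, bytes) tuples.
flows = [("10.0.0.1", n) for n in (5, 900, 42, 17, 300, 8, 64, 1200, 3, 77, 510, 29)]

# ORDER filtered BY bytes DESC; then LIMIT 10
ordered = sorted(flows, key=lambda f: f[1], reverse=True)
top_ten = ordered[:10]

# Same ten rows without sorting the whole input.
top_ten_heap = heapq.nlargest(10, flows, key=lambda f: f[1])
```

Whatever the input size, the output has at most ten rows, which is why a 699-line result file points at the LIMIT not being applied.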




Limit on content size
Apache 2.11:

Is there a way to limit the size of a transmission from clients? For
example, if a client sends a 1 GB request to our server, we would like to
reject it with an error message.


Broker limit no consumers
Dear all,
I really appreciate the great effort of Apache.
I have a question: can I limit the number of consumers connected to a
specific broker? I have a network of brokers and need to do that.




Thanks in advance




Querying and Limit in mongoexport
I have a collection of large objects with multiple fields and embedded
objects stored in MongoDB.
My main goal is to translate this (retrieving a subset of fields)
query to the correct query format for mongoexport:

db.things.find( { }, { "subscriber.id" : 1 } ).limit(10);

I got this to work somewhat without a limit just by specifying the csv
fields as subscriber.id without any query at all.

Now, I am doing a query like this for mongoexport (I know it has to be
strict JSON)
--query  "{ 'subscriber' : { 'id': { '$lt' : 2000 } } }"

where I have an embedded subscriber object, and within that object there
is a field id.

Not only does that not return any results, but I would also like to
use $limit rather than a less-than comparison.

I assume I do --query  "{ 'subscriber' : { 'id': { '$limit' : 20 } } }"
or something like that.

Can someone help me out? I'm sure it's extremely simple. Thanks.





Cassandra's 2GB row limit and indexing
Hi all,

I'm currently looking at new database options for a URL shortener in order
to scale well with increased traffic as we add new features. Cassandra seems
to be a good fit for many of our requirements, but I'm struggling a bit to
find ways of designing certain indexes in Cassandra due to its 2GB row limit.

The easiest example of this is that I'd like to create an index by the
domain that shortened URLs are linking to, mostly for spam control so it's
easy to grab all the links to any given domain. As far as I can tell the
typical way to do this in Cassandra is something like:

DOMAIN = { //columnfamily
    thing.com { //row key
        timestamp: "shorturl567", //column name: value
        timestamp: "shorturl144",
        timestamp: "shorturl112",
        ...
    }
    somethingelse.com {
        timestamp: "shorturl817",
        ...
    }
}

The values here are keys for another columnfamily containing various data on
shortened URLs.

The problem with this approach is that a popular domain (e.g. blogspot.com)
could be used in many millions of shortened URLs, so would have that many
columns and hit the row size limit mentioned at
http://wiki.apache.org/cassandra/CassandraLimitations.

Does anyone know an effective way to design this type of one-to-many index
around this limitation (could be something obvious I'm missing)? If not, are
the changes proposed for
https://issues.apache.org/jira/browse/CASSANDRA-16 likely to make this
type of design workable?

Thanks in advance for any advice,

Richard
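One possible way around a per-row size limit, offered here only as an assumption and not as something from the thread, is to split each domain's row into time buckets so that no single row key collects every link. The sketch below models the DOMAIN column family as an in-memory dict; the `bucket_key`, `add_link` and `links_for` helpers and the monthly bucket size are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timezone

# In-memory stand-in for the DOMAIN column family: row key -> {column name: value}
domain_index = defaultdict(dict)


def bucket_key(domain, ts):
    """Row key with a YYYYMM suffix so a busy domain is spread over many rows."""
    month = datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y%m")
    return "%s:%s" % (domain, month)


def add_link(domain, ts, short_url):
    """Store the short URL under the row for this domain and month."""
    domain_index[bucket_key(domain, ts)][ts] = short_url


def links_for(domain, months):
    """Collect short URLs for a domain across the given YYYYMM buckets, oldest first."""
    out = []
    for month in months:
        row = domain_index.get("%s:%s" % (domain, month), {})
        out.extend(row[ts] for ts in sorted(row))
    return out


add_link("blogspot.com", 1280620900, "shorturl567")  # August 2010 (UTC)
add_link("blogspot.com", 1280621000, "shorturl144")  # August 2010 (UTC)
add_link("blogspot.com", 1283299300, "shorturl112")  # September 2010 (UTC)
august = links_for("blogspot.com", ["201008"])
```

The trade-off is that reading every link for a domain now means reading several rows, so the reader has to know (or enumerate) which buckets exist.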


Limit of bundles and corresponding performance
Hi,

Our production environment consists of >= 500 websites, which are
currently all written in PHP. We're considering moving from PHP to Java but are
unsure if it's the right choice and if it is even possible with that many
sites (memory?).

At the moment we're considering to use Felix as the container for all 
applications. So Felix will have > 500 bundles. Each bundle (at least 
the webapps) will be around 100kb each. I'd like to know whether it is 
possible to have that many bundles deployed in Felix and would like to 
know more about the performance aspect.

So in short: is it possible to deploy ~ 500 bundles on one container and 
have the same or even better performance than the same server using PHP? 
Or do you have any other (better)  suggestions?

Regards,

Sander de Groot


Limit the database size
Hello,

Is it possible to set a limit on the size a database can take? For
example, I would like to limit a database to 100MB.

thanks for any help!





Will the 4MB per document limit ever be raised?
Maybe some compile-time flag or config option to increase this
per-document limit? I can think of several documents that with all the
attributes and user-generated comments/votes may reach 4MB.

This just seems like another arbitrary hard-coded limitation. Nobody
likes hard-coded limitations.





Re: Redis list size limit
It would be nice for the Redis Wiki to have a few graphs that show
performance of different functions (e.g. SET inserts) in relation to
the size of the collection.

For an example, see http://en.wikipedia.org/wiki/Trie.


On Aug 3, 11:15 am, Salvatore Sanfilippo <anti### @gmail.com> wrote:
 On Tue, Aug 3, 2010 at 7:31 AM, trung <tr### @phamcom.com> wrote:
 > Hi,

 > I have a question about the list data structure in redis. Is there a
 > size limit of the list?

 > I know that resque uses redis's list data structure, just curious what
 > would have happen if a bunch of jobs start to back up. Will the list
 > blow up because it's getting too big?

 It is guaranteed that a Redis list can be at least 2 billion of
 elements in all the archs.
 it's very hard to reach this limit in the real world, you'll likely
 end the RAM much faster.

 Cheers,
 Salvatore

 > Thanks.

Re: Redis list size limit
Thanks. I feel much safer now. :)

geocouch: limit number of points in an area
I'm looking for a way to limit the number of points retrieved for an area
depending on its size. I.e., when playing with the zoom on the map, I want
to retrieve and display only the main points instead of
all the points in this area. Is there a simple way to do that?

- benoit


ModSecurity 2.5.11 limit upload file size
Hi all,

I want to limit the file upload size on my website using ModSecurity
2.5.11.

So I added

# Maximum request body size we will
# accept for buffering
SecRequestBodyLimit 131072

to my modsecurity_crs_10_config.conf.

I also configured an ErrorDocument for status code 413 (anything over this
limit will be rejected with status code 413 Request Entity Too Large).

I expected a response page with status code 413 to be returned to the
browser.

However, the browser reports "The connection was reset".

I can find the HTTP/1.1 413 Request Entity Too Large in the audit log.

I am sure that my custom error document is fine because I can access it
by typing the URL in the browser.
The error documents for other status codes such as 400, 403 and 500 work
fine.

I really wonder what happens with the 413 status code.

I have tried to find another way to limit the file upload size, and I found a
rule that seems suitable:

## -- File upload limits --

# Individual file size is limited
#SecRule FILES_SIZES "@gt 1048576" "phase:2,t:none,block,log,auditlog,status:403,msg:'Uploaded file size too large',id:'960342',severity:'4',setvar:'tx.msg=%{rule.msg}',setvar:tx.anomaly_score=+5,setvar:tx.policy_score=+1,setvar:tx.%{rule.id}-POLICY/SIZE_LIMIT-%{matched_var_name}=%{matched_var}"

However, it doesn't block anything when I upload a 2 MB file.

Does this rule work?

Or have I done something wrong?

Please help!

Thanks a lot!

Here are my logs:

ModSecurity audit log:

--dd60e76b-A--
[28/May/2010:02:16:05 +0000] Fuaas38AAAEAAGKWBNgAAAAB 192.168.185.75 4580 192.168.51.111 7700
--dd60e76b-B--
POST <mysite>/merchanteditgeneral.do HTTP/1.1
Host: 192.168.51.111:7700
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.9) Gecko/20100315 Firefox/3.5.9 ( .NET CLR 3.5.30729)
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: <mysite>/merchanteditgeneral.prepare?merchantID=123
Cookie: JSESSIONID=00004I1ku6XOBJX5mhowCIhnURk:110esvoc9
Content-Type: multipart/form-data; boundary=---------------------------1766167231251
Content-Length: 2165009

--dd60e76b-F--
HTTP/1.1 413 Request Entity Too Large
Last-Modified: Tue, 25 May 2010 03:04:05 GMT
ETag: "35e40-1de3-69001340"
Accept-Ranges: bytes
Content-Length: 7651
Connection: close
Content-Type: text/html

--dd60e76b-H--
Message: Request body (Content-Length) is larger than the configured limit (131072).
Stopwatch: 1275012965636787 2317 (- - -)
Producer: ModSecurity for Apache/2.5.11 (http://www.modsecurity.org/); core ruleset/2.0.3.
Server: Apache/2.0.63 (Unix) mod_ssl/2.0.63 OpenSSL/0.9.8m

--dd60e76b-K--
SecAction "phase:1,status:403,t:none,pass,nolog,initcol:global=global,initcol:ip=%{remote_addr}"

--dd60e76b-Z--

ModSecurity debug log:

[28/May/2010:02:30:56 +0000] [192.168.11.111/sid#9ad0668][rid#b0cde88][/Bank/secure/Merchant/merchantedit/merchanteditgeneral.do][1] Request body (Content-Length) is larger than the configured limit (131072).

Error log:

[Fri May 28 02:30:56 2010] [error] [client 192.168.11.72] ModSecurity: Request body (Content-Length) is larger than the configured limit (131072). [hostname "192.168.11.111"] [uri "<mysite>/merchanteditgeneral.do"] [unique_id "TAdbtn8AAAEAAGLMUroAAAAN"]




[users@httpd] Webdav - Files sent twice with <limit PUT>
Hi all,

I have a server running Apache with mod_dav enabled.
I am trying to set up a directory where only valid users can put files, while
anonymous users can get these files.
I have an issue with this configuration. Below is the behavior when a valid
user puts a file on the server:
- the file is uploaded to the server
- the server asks for login/password
- the file is uploaded to the server again.

Does anyone have an idea why the file is put twice? How can I fix it?

Please find my WebDAV configuration below:
    <IfModule mod_dav.c>
      DavLockDB /local/var/apache2/DavLock
      <Directory "/local/var/www/share">
        DAV On
        DavMinTimeout 300
        Options Indexes MultiViews FollowSymlinks
        AllowOverride None
        Order allow,deny
        Allow from all
        AuthName "Company"
        AuthType Basic
        AuthBasicProvider ldap
        AuthLDAPURL "ldap://ldap.company.com:389/ou=users,dc=company,dc=com?uid"
        AddDefaultCharset utf-8
        AuthzLDAPAuthoritative off
        <Limit POST PUT DELETE PROPPATCH MKCOL COPY MOVE LOCK UNLOCK>
          Require valid-user
        </Limit>
      </Directory>
    </IfModule>

Kind Regards !

Remi


Limit total number of columns for multiget_slice
Hello, I want to know if the following is possible:

I want to query multiple keys in a column family.  I want to limit the
total # of columns returned.

The number of columns for a given key can be anywhere from 0 to several
million.

For example I want to get 50 columns total from keys A & C.  If A and
C have 40 columns each, I want the 40 columns from A and the first 10
columns from C.

CF
{
    A:            Key
    {
        Col1 {..}
        Col2 {..}
        ...
    }

    B:            Key
    {
        Col1 {..}
        Col2 {..}
        ...
    }

    C:            Key
    {
        Col1 {..}
        Col2 {..}
        ...
    }

    D:            Key
    {
        Col1 {..}
        Col2 {..}
        ...
    }
}
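The behaviour asked for (40 columns from A, then the first 10 from C, 50 in total) can be sketched on the client side as below. This is plain Python over an in-memory dict, not a Cassandra call; `limited_multiget` is a hypothetical helper, and a real client would presumably fetch each key's slice with the remaining count rather than slicing a list.

```python
def limited_multiget(rows, keys, total_limit):
    """Return {key: columns}, taking columns key by key until total_limit is reached."""
    result = {}
    remaining = total_limit
    for key in keys:
        if remaining <= 0:
            break  # limit already used up by earlier keys
        cols = rows.get(key, [])[:remaining]
        result[key] = cols
        remaining -= len(cols)
    return result


# Hypothetical column family: keys A-D with 40 columns each.
rows = {k: ["Col%d" % i for i in range(1, 41)] for k in "ABCD"}
got = limited_multiget(rows, ["A", "C"], 50)
```

Because the keys are visited in the order given, the result depends on that order: asking for ["C", "A"] instead would return all of C and only the first 10 columns of A.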