Search result growth and issues

Over the weekend I ran into an issue: adding a search to the rsyslog
client cookbook, rather than using a statically set rsyslog server
hostname (http://friendpaste.com/53XqkjYJllP7unEXsAiCnO), caused a huge
growth in bandwidth (http://img245.yfrog.com/i/deployerifeth0day.png/)
and load on my chef server. I spoke with Adam and Barry on IRC about it,
and the consensus was that something wonky in ferret was likely causing
the indexes to grow (results appending to themselves, iirc). By the time
I cleaned things out and switched back to my original code, the index
size was 145277.

What I am interested in is figuring out whether this is something
unusual with ferret that is simply due to my set of circumstances, or
whether it has happened to others. I would also like to know if
something I am doing wrong caused the problem in the first place. I
understand there are changes coming to search, but I would like to know
how best to use the current system, caveats and all.
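
For illustration only, here is a minimal sketch of the kind of defensive
lookup in question, written in Python rather than recipe Ruby. It assumes
the search results arrive as a list of dicts carrying a hostname under an
"fqdn" key; the function name, the key, and the example hostnames are all
hypothetical and not taken from the actual cookbook. Duplicate rows
collapse to one hostname, and the static hostname is used whenever search
returns nothing usable.

```python
from typing import Iterable, Mapping


def pick_rsyslog_server(
    results: Iterable[Mapping[str, str]],
    static_host: str,
    attr: str = "fqdn",  # hypothetical attribute name holding the hostname
) -> str:
    """Pick one rsyslog server from search results, else fall back."""
    # A set collapses duplicate rows that point at the same host.
    hosts = {row[attr] for row in results if row.get(attr)}
    # min() makes the choice deterministic from run to run.
    return min(hosts) if hosts else static_host
```

Sorting before choosing keeps the rendered client config stable between
runs even if the order of the results changes.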

Advice, comments, suggestions are appreciated.

Thanks.
-Joe


Name: Joseph A. Williams
Email: joe@joetify.com
Blog: http://www.joeandmotorboat.com/

Anyone have experiences, suggestions, ideas?

Thanks.

-Joe

On Wed, 9 Sep 2009 09:50:21 -0700
Joe Williams joe@joetify.com wrote:

Over the weekend I experienced an issue where adding a search to the
rsyslog client cookbook rather than using a statically set rsyslog
server hostname (Friendpaste - WWRcNlmblMvh) caused
a huge growth in bandwidth
(http://img245.yfrog.com/i/deployerifeth0day.png/) and load on my chef
server. I spoke with Adam and Barry on IRC regarding it and it was
deemed that it was likely something wonky with ferret causing the
indexes to grow, results appending to themselves iirc. By the time I
cleaned things out and switched back to my original code the index
size was 145277.

What I am interested in is figuring out whether this is something
unusual with ferret that is simply due to my set of circumstances, or
whether it has happened to others. I would also like to know if
something I am doing wrong caused the problem in the first place. I
understand there are changes coming to search, but I would like to know
how best to use the current system, caveats and all.

Advice, comments, suggestions are appreciated.

Thanks.
-Joe

--
Name: Joseph A. Williams
Email: joe@joetify.com
Blog: http://www.joeandmotorboat.com/

It's not unique to you, and it's definitely related to Ferret.

We're removing that entire sub-system (in part because of bugs like
these) in the 0.8 release. In the interim, wiping out the index and
re-building it is your best option.

Adam

On Mon, Sep 14, 2009 at 9:34 AM, Joe Williams joe@joetify.com wrote:

Anyone have experiences, suggestions, ideas?

Thanks.

-Joe

--
Opscode, Inc.
Adam Jacob, CTO
T: (206) 508-7449 E: adam@opscode.com

Thanks for the response Adam. Glad to hear things are moving along for
0.8.

-Joe

On Mon, 14 Sep 2009 09:59:05 -0700
Adam Jacob adam@opscode.com wrote:

It's not unique to you, and it's definitely related to Ferret.

We're removing that entire sub-system (in part because of bugs like
these) in the 0.8 release. In the interim, wiping out the index and
re-building it is your best option.

Adam

--
Name: Joseph A. Williams
Email: joe@joetify.com
Blog: http://www.joeandmotorboat.com/

Yeah, the chef project I worked on a few months ago (in the 0.6.x era)
used the REST API for search to find different types of infrastructure
and used those results to configure others. Ferret's duplicate docs
forced me to dedupe in code (PITA). Additionally, I ran a periodic job
that eviscerated and rebuilt the index. At the time, the only way to
rebuild it was to "converge everything" (that was the fujinese, IIRC; in
English: run chef-client on all of the nodes). I don't know if it's
related to doc versioning in couch or what, but that ferret shite was
really irritating. I would think that solr's faceted search is closer to
what people want, but I'm kinduv out of the chef loop ATM.
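
As a rough sketch of the "dedupe in code" step, not the code from that
project: assuming each search result is a dict with a unique "id" field
(a hypothetical shape, as are the function name and the sample values),
something like the following keeps one document per id and lets the most
recently returned copy win.

```python
from typing import Any, Dict, Iterable, List


def dedupe_results(
    results: Iterable[Dict[str, Any]],
    key: str = "id",  # hypothetical field that uniquely identifies a document
) -> List[Dict[str, Any]]:
    """Collapse duplicate search documents, keeping the last copy per key."""
    latest: Dict[Any, Dict[str, Any]] = {}
    for doc in results:
        # A later duplicate overwrites the earlier one; the key keeps the
        # position at which it was first seen.
        latest[doc[key]] = doc
    return list(latest.values())
```

For example, passing two copies of "node_a" and one of "node_b" returns
two documents, with the second "node_a" copy kept.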

Joe Williams wrote:

Anyone have experiences, suggestions, ideas?

Thanks.

-Joe

--
Ian Kallen
blog: What's That Noise?! [Ian Kallen's Weblog]
tweetz: http://twitter.com/spidaman
vox: 925.385.8426