On Jul 14, 2009, at 9:13 PM, Jim Starkey wrote:
> A gold star for Lars!
>
Thanks! :)
>
> Lars-Erik Bjørk wrote:
>> Hi all,
>>
>> I have looked at bug#46083 (falcon_pagesize2K fails in
>> IndexRootPage::addIndexEntry() during recovery) today, and I have
>> debugged the relevant parts using the log and tablespace files
>> provided by Olav.
>>
>> This is what seems to happen:
>>
>> We try to add a node in IndexRootPage::addIndexEntry. Here is a
>> piece of the code:
>>
>> <snip>
>>
>> /* If the node fits on page, we're done */
>>
>> AddNodeResult result;
>>
>> for (;;)
>> {
>> result = page->addNode(dbb, key, recordNumber);
>> <snip>
>>
>> The node does not fit on the page, and we are told to split the
>> page at the middle.
>>
>> In IndexPage::splitIndexPageMiddle, we search for a node to split
>> on. We want to chose a node
>> where the prefix compression is poor.
>>
>> <snip>
>> for (; node.node < pageEnd; node.getNext(bucketEnd))
>> {
>> //int l =
>> node.expandKey(key);
>>
>> if (node.offset || node.length)
>> {
>> if (node.nextNode > midpoint)
>> break;
>>
>> chain = node.node;
>> }
>>
>> }
>> <snip>
>>
>>
>> Then we check to make sure that this is not the last node on the
>> page, and if it is, we split on the previous node (prevNode)
>> (I sent a mail about the calculation of the previous node some
>> weeks ago, unrelated to this):
>>
>> // The split node should never be the last node on the page
>> if ((UCHAR*) node.nextNode > (UCHAR*) this + length)
>> node = prevNode;
>>
>> What I see in the debugger is that (UCHAR*) node.nextNode ==
>> (UCHAR*) this + length (note *equal*). To my understanding, this
>> *is* the last node on the page, and the condition should have been
>> '>=' instead of just '>'. This means that we in this case split on
>> the last node instead of the second to last node. On the next
>> iteration we will navigate to this page once again, and try to
>> insert into it. This page is still full (same length as the last
>> time, we split on the END_BUCKET node I assume), and we will split
>> it again. This goes on ... and on (I have looped through 14 of the
>> iterations).
>>
>> If I change this condition from '>' to '>=' and also actually
>> update prevNode in the loop showed earlier, so that it does not
>> refer to the first node on the page, the node will be inserted on
>> the second iteration, after the first split. The recovery still
>> segfaults, but on a different place that I have no knowledge of,
>> further down the recovery process :)
>>
>> Does this sound like a plausible theory?
>>
>> /Lars-Erik
>>
>>
>>
>>
>
>
> --
> Jim Starkey
> President, NimbusDB, Inc.
> 978 526-1376
>
>
> --
> Falcon Storage Engine Mailing List
> For list archives: http://lists.mysql.com/falcon
> To unsubscribe: http://lists.mysql.com/falcon?unsub=1
>