Discussion:
[mdb-dev] (no subject)
Yannick Heinrich
2015-12-14 12:34:25 UTC
Permalink
Hello,

I'm currently trying to create a small Go tool that could deal with mdb
files.

I'm currently reading the HACKING file at
https://github.com/brianb/mdbtools/blob/master/HACKING to be able to decode
a Jet4 file for now.

I don't really understand the definition of tdef_pg in the data page
section.

Is it the index of the page within the whole file? Is it an offset
relative to the beginning of the file?

I came across this page discussing tdef_pg on SourceForge:
http://sourceforge.net/p/mdbtools/mailman/message/3267842/

I wrote a small program that prints the headers of all the data pages, and I
did not get the offset mentioned on this page.
Moreover, that page mentions a page_id, but there is no trace of it in the
HACKING file.

Could someone give me more information about the tdef_pg pointer and, more
generally, how a data definition page is found from the data page header?

Yannick
Yannick Heinrich
2016-01-12 12:16:07 UTC
Permalink
Hello again :)

I'm moving forward with my Go library and I have arrived at the point where
I need to read the records from the data pages.
From what I understood by reading the source code of libmdb, all the
secrets are contained in the MSysObjects table.

Is there a good reference or description of the MSysObjects table and
its objects? What is the basic approach?


Regards
Yannick Heinrich
------------------------------------------------------------------------------
_______________________________________________
mdbtools-dev mailing list
https://lists.sourceforge.net/lists/listinfo/mdbtools-dev
Jakob Egger
2016-01-12 21:23:01 UTC
Permalink
MSysObjects just tells you the name, type, etc. of the table. You can ignore it if you just want to read the data.

If you want to read the records, you need information from the tdef page. The tdef pages tell you what fields are in a table, what types they have, and so on. You need that information to read the data pages.

What does a data page look like? Well, it has a header that tells you what table it belongs to (table id = index of the tdef page). Then it is filled with records, starting from the end. The header stores the offsets of the records.

What do records look like?
Each record has all the fixed length fields in the front (use info from tdef page to parse). All the variable length fields are at the end, starting from the back. You could actually parse variable length fields without the info from the tdef page (at least text columns). The NULL map (which fields are NULL) is also stored at the end of the record.

How do you find data pages for a specific table?
1) Look at the page usage bitmap for the table (complicated)
2) Just loop through all pages in the file and look at the table id (easy, fast enough for all but the largest databases)

Hope this helps a little. Reading MDB files is unfortunately not trivial.

Jakob
Yannick Heinrich
2016-01-28 22:34:57 UTC
Permalink
This helps me to understand the process :)

Maybe you could give me some more explanation about usage maps.

The HACKING file in the mdbtools directory describes the different
page types in the file. 0x05 is for "Page Usage Bitmaps".

But how are they linked to the other pages (data and table definition)?
In the descriptions of these pages, the document refers to a "usage
bit mask".

What is the difference between Page Usage Bitmaps and the usage bit mask?

There are also two types of page usage map (0x00 and 0x01). Are those types
present after the page usage bitmaps header?

Regards
Yannick
Jakob Egger
2016-01-30 10:27:39 UTC
Permalink
First of all: page usage maps are only used to improve performance; you don't need them if you just want to read data.

Every table has two main page usage bitmaps: the bitmap with all the used pages, and the bitmap with free pages.

The used_pages bitmap tells you which pages in the database belong to that table. You can find the same info by just looping through all pages and checking the table id in the page header. Anyway, here are two simple examples: If page index 6 belongs to a table, then the page usage bitmap would be 01000000 in binary, or the single byte 0x40. If pages 6 and 9 belong to a table, the page usage bitmap would be two bytes: 0x40 0x02.

The two different types of bitmaps are for small and for large databases. If a bitmap for the full database fits on a single page (fewer than 32736 pages), then type 0 bitmaps are used. Otherwise type 1 bitmaps are used, meaning that the page usage bitmap is spread over multiple pages (see the HACKING file for details).

The free space usage map is only necessary if you want to support writing: it tells Access on which pages it can find space to write a new row.

Then there are usage maps related to the indexes, and apparently there are also usage maps related to columns at the end of the tdef pages... I don't know how those work, I assume those are also some kinds of optimisations.

I hope this helps, but as I said before, I'd recommend ignoring the page usage bitmaps at first. In my experience, they are also sometimes corrupted, so sometimes you need to fall back to scanning the full database anyway.

Jakob