crash
10/03/2012
FILE0015.CHK
Now Playing: Mission Control @SomaFM
Topic: CD images

Restoration of CD Images by Checksums and Size.
(Zem Asfalta Ir Pludmale)


v0.2


#0.0 - Foreword
#1.0 - Importance of multiple samples
#2.0 - Audio tracks
#2.1 - Fixing by comparison
#2.2 - Pattern regeneration
#3.0 - Data tracks
#3.1 - System area
#3.2 - Primary Volume Descriptor
#3.3 - Root Directory Record
#3.4 - Patches
#3.5 - EDC/noEDC
#3.6 - Damaged files
#3.7 - Different versions
#3.8 - Other things to try
#4.0 - Version history



#0.0 - Foreword

It's surprising how little information can sometimes be enough to reverse changes in files with a well-defined structure, assuming you know this structure well and can judge where the changes occurred. We'll take a look at CD images here, PSX images in particular, but similar principles can be applied to more general cases or to different data altogether.
Definition of the problem: sizes and checksums (CRC32) are given for a CD image divided into tracks; one or more tracks of our image do not match the reference by those values.


#1.0 - Importance of multiple samples

A very important part of the described process is obtaining multiple samples. It's not always a necessity, but it eases the next steps tremendously and opens up more possibilities; the new sample might even be good right away, eliminating the need for further processing at all. Possibly you could just get the CD. Weigh time against money: if you restore some images but it takes a lot of time, the wiser choice might have been to part with some cash. Your time has value, do not forget it. However, if purchasing is not an option, often something can be found online: forums (do not ignore foreign forums; such smaller and isolated communities can actually be the most interesting ones), file hosting services (use specialized search engines for those, like filestube.com, filesearch.ru, etc.), usenet, irc, p2p (torrents, emule, dc, share, winny, etc.) - you know the drill.
[A specific source for PSX images is PSP converted data (EBOOT.PBP). You can look for those too - I'll try to cover extraction of such embedded images later in a separate guide.]


#2.0 - Audio tracks

Now, regarding audio tracks. There is little you can do with audio tracks if you do not have multiple samples. This data is just a raw audio signal, so correction is done by comparing two samples and eliminating the erroneous data. You should also understand what the audio offset is and how to compensate for it (read the guides and forum posts at redump.org; however, do not use the psxt001z 'track' command - it's slow and broken; you could use fff in its stead) and be familiar with CD image basics: how to convert from one format to another, split tracks apart, recreate gaps and such. Usually there shouldn't be any data in the first gap (data->audio); if you see sector headers or some garbage there, try removing it so the gap is all clean up until the audio kicks in. Most of the time those are just software/hardware artifacts. Also worth mentioning: audio data on PSX CDs should usually start at hex offset 0x56220 (if you keep tracks with gaps at the beginning, that is - a la redump.org), i.e. right after the 150-sector gap (352800 / 2352 = 150). So if for some reason you are unable to determine the audio offset of an image, try just aligning the data this way.
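The alignment figure can be double-checked with a few lines of Python. This is only a sketch: the names `GAP_BYTES`, `first_nonzero` and `gap_is_clean` are made up for illustration, and it assumes the track file keeps its 150-sector gap at the start (redump.org style).

```python
SECTOR_SIZE = 2352   # bytes in one raw CD sector
GAP_SECTORS = 150    # 2-second gap kept at the start of the track
GAP_BYTES = GAP_SECTORS * SECTOR_SIZE   # 352800 == 0x56220


def first_nonzero(data: bytes) -> int:
    """Offset of the first non-zero byte, or len(data) if everything is zero."""
    return len(data) - len(data.lstrip(b"\x00"))


def gap_is_clean(track: bytes) -> bool:
    """True if nothing but zeros shows up before the expected start of audio."""
    gap = track[:GAP_BYTES]
    return first_nonzero(gap) == len(gap)
```

Keep in mind that audio often begins with digital silence, so `first_nonzero` only gives a rough hint of where the signal kicks in, not the exact offset.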


#2.1 - Fixing by comparison

So I'll assume now that you have at least 2 different samples of the audio tracks (offset corrected / aligned the same way) that do not match the given checksum values. Do a binary compare on them to find the first difference: 'fc /b "Track 02.001" "Track 02.002" |more', e.g.:

Comparing files Track 27.bin and TRACK 27.BAD
001A72B0: C2 62
001A72B2: 9B 61
001A72DE: 37 C0
001A72DF: 05 04
001A72E6: 90 F8
001E0B44: 1F CF
001E0B45: F1 F0
001E0B4C: 5E 45
001E0B4D: F7 F6
...

Locate the first returned offset in a hex editor. Most of the time you'll see something similar to this:

[screenshot: hex view of both tracks at the first difference; differing values marked red, their neighbouring samples marked green]

Audio data in the 2nd image was interpolated (red). Interpolation is the most common strategy employed by CD readers to mask audio reading errors. You can tell it took place if the values (int16) are close to the ones calculated by adding the previous and next audio samples (green) and dividing the sum by two.
($1cea + $19da)/2 = $1b62
($12e2 + $11e0)/2 = $1261
($0725 + $025d)/2 = $04c1
($025d + $fd94)/2 = $fff8
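The same check can be scripted. Below is a minimal sketch, not any existing tool: the function names are hypothetical, and `stride=2` is an assumption that the samples are interleaved stereo, so the neighbours of one channel sit two values apart. Note that the last example only works out with signed arithmetic and rounding down.

```python
def s16(word: int) -> int:
    """Interpret a 16-bit word as a signed sample."""
    return word - 0x10000 if word & 0x8000 else word


def midpoint(prev_word: int, next_word: int) -> int:
    """Average of two samples, rounded down, returned as a 16-bit word."""
    return ((s16(prev_word) + s16(next_word)) // 2) & 0xFFFF


def looks_interpolated(samples, i, stride=2, tolerance=1):
    """True if samples[i] is within tolerance of the midpoint of its neighbours.

    samples holds signed 16-bit values; stride=2 assumes interleaved stereo.
    """
    if i < stride or i + stride >= len(samples):
        return False
    mid = (samples[i - stride] + samples[i + stride]) // 2
    return abs(samples[i] - mid) <= tolerance
```

For instance, `midpoint(0x025d, 0xfd94)` gives `0xfff8`, matching the last line of the list.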

So you'll cut this part out of the affected file and replace it with the good part from the other one. To do this I use my own commandline programs reMove and sExtract, but you're free to choose your own. There are rarely more than 3 sectors in a row affected, so you can either go for that or determine the exact size by offsets, e.g. it's 001A72B0..001A72E6 for the case illustrated above. Afterwards proceed to the next difference and so on.
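If you'd rather script the replacement, something along these lines would do. It is a sketch under the assumption that both tracks are already aligned and of equal size; `splice` and `sector_span` are illustrative names, not the reMove / sExtract programs mentioned above.

```python
import zlib

SECTOR_SIZE = 2352


def splice(bad: bytes, good: bytes, start: int, end: int) -> bytes:
    """Copy bytes start..end (inclusive) from good into bad."""
    if len(bad) != len(good):
        raise ValueError("tracks must be aligned and of equal size")
    return bad[:start] + good[start:end + 1] + bad[end + 1:]


def sector_span(start: int, end: int) -> tuple:
    """Widen a byte range to whole 2352-byte sectors."""
    first = start - start % SECTOR_SIZE
    last = end - end % SECTOR_SIZE + SECTOR_SIZE - 1
    return first, last
```

After each splice, `zlib.crc32(result)` can be compared against the reference checksum to see whether more differences remain.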


Other cases might be: mute, noise, and either positive or negative desync. Mute is easy to spot - part of the audio is replaced with silence. Noise looks somewhat similar to interpolation - a few bytes differ here and there, except that those values make no sense: they break the audio wave pattern. Desync is when the reader skips a little back, thus adding extra data, or skips ahead, cutting some data out. You can tell by all data starting from the reported offset looking different. Search for this data from the 1st file in itself and in the 2nd file, and vice versa. This way you should be able to determine the nature of those changes, e.g.: if you find the same pattern a little further back in one of the files, then the drive skipped back there, hence the data gets repeated; if you find a pattern from one of the files in the other, but a little further ahead, then the drive skipped forth, jumping over some of the data. Seldom you might run into data from older CD readers where similar patterns can be observed, not on the audio sample level but on the sector level. Just the same - apply the necessary corrections to the corrupted track fragment and proceed.
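The search for displaced data can be automated roughly like this. It is a sketch with hypothetical names; the probe length and search window are arbitrary values picked for illustration.

```python
def find_shift(a: bytes, b: bytes, offset: int, probe: int = 64, window: int = 4096):
    """Look for the data that follows offset in a somewhere near offset in b.

    Returns how many bytes further ahead (positive) or back (negative) the
    fragment sits in b, or None if it is not found within the window.
    """
    needle = a[offset:offset + probe]
    lo = max(0, offset - window)
    pos = b.find(needle, lo, offset + window + probe)
    return None if pos < 0 else pos - offset
```

Run it both ways, `find_shift(a, b, offset)` and `find_shift(b, a, offset)`, as described above. The result only tells how far the data moved; which of the files has the repeat or the cut still has to be judged by looking for the repeated pattern inside each file.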

If you're unable to determine which file has the erroneous data but there aren't too many differences, a brute-force approach can be applied. The commandline program reCombine just runs through all possible combinations of the two files, looking for a certain CRC value.
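To illustrate the idea (this is not reCombine's actual implementation, which isn't shown here), a brute-force pass might look like the following sketch. It assumes both files are aligned and of equal size; `merge_gap` and `max_regions` are made-up tuning parameters. With n differing regions there are 2^n combinations, so it only pays off when differences are few.

```python
import itertools
import zlib


def diff_regions(a: bytes, b: bytes, merge_gap: int = 16) -> list:
    """Inclusive (start, end) ranges where a and b differ; close ones are merged."""
    regions = []
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            if regions and i - regions[-1][1] <= merge_gap:
                regions[-1][1] = i
            else:
                regions.append([i, i])
    return [tuple(r) for r in regions]


def recombine(a: bytes, b: bytes, target_crc: int, max_regions: int = 20):
    """Try every mix of a's and b's differing regions until the CRC32 matches."""
    regions = diff_regions(a, b)
    if len(regions) > max_regions:
        raise ValueError("too many differences to brute force")
    for picks in itertools.product((False, True), repeat=len(regions)):
        out = bytearray(a)
        for take_b, (start, end) in zip(picks, regions):
            if take_b:
                out[start:end + 1] = b[start:end + 1]
        if zlib.crc32(out) == target_crc:
            return bytes(out)
    return None
```

The byte-by-byte comparison in pure Python is slow on full-size tracks; it is kept simple here on purpose.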

If you could not obtain an alternative image with tracks that differ from the ones in need of correction, there is still hope. Sometimes there are common tracks that can be found in other games / other versions of the same game (even demos and such; if a game has multiple CDs, often some of the data repeats there) / releases of the same game for different regions. Check for that: e.g., if you're using the redump.org db for reference, get the latest .dat and look for the track's CRC in it with a text editor's search function. The same game might have been released for a different system, e.g. Saturn, Dreamcast, NeoGeo, etc., so check there too. When checking different systems / regions, remember that the game's title might have changed. The offset can also be different - compare track sizes - if they're somewhat close, get this version anyway and try to locate some data from your track there with a hex editor (I use FAR's internal viewer F3 or, to locate larger fragments of data when a few bytes yield too many hits, my own program - fff).

Sometimes fragments of audio repeat within the track itself [restored the last tracks of SLPS-00131, SLPS-00187, SLPS-00391 & SLPS-01782 using parts of themselves as alternate samples]. Sometimes a part of the audio data or even a whole track is mirrored in another track - search the other tracks for a hex string from the damaged track [in the image of SLPS-00340 I had, Track 14 could be deconstructed into fragments of Track 04 and Track 13, except for 4 odd bytes. When Track 14 was recreated from those fragments, omitting the spare bytes, it matched the checksum; the end of SLPS-00572's Track 14 could be restored from Track 11].

It also might be that the audio track is mirrored on the data track as a dummy file [restored the last track of SLPS-01441 this way].


#2.2 - Pattern regeneration

Sometimes a whole track consists of a certain pattern. The most frequent subset of such cases is dummy tracks (all 0x00). They can be generated with the psxt001z command 'gen'. It could also be a different pattern, e.g. all 0xff's. Most likely you'll need to do some programming to recreate such tracks. Sometimes there's a certain pattern only in a specific part of the track, the very end most of the time. With some programming it can be restored too. For example, if the audio at the very end of the last track is cut off because of the offset, but the last bytes that remain form a constant 0xff pattern, you would first align this track properly, filling the missing data with 0x00, and then write a program that expands 0xff further, replacing one byte of 0x00 at a time and checking the CRC at each step. Going one byte at a time is needed because the data seldom lasts until the very end: the last few tens of bytes are usually 0x00.
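The expansion loop from this example could be sketched as follows. `extend_fill` is a hypothetical name, and the track is assumed to be already aligned and padded with 0x00 to its reference size.

```python
import zlib


def extend_fill(track: bytes, target_crc: int, fill: int = 0xFF):
    """Grow the run of fill bytes into the trailing zeros until the CRC32 matches.

    Returns the repaired track, or None if no length of the run fits.
    """
    data = bytearray(track)
    end = len(data.rstrip(b"\x00"))      # where the trailing zero padding begins
    if end == 0 or data[end - 1] != fill:
        return None                      # track does not end in the fill pattern
    for pos in range(end, len(data) + 1):
        if zlib.crc32(data) == target_crc:
            return bytes(data)
        if pos < len(data):
            data[pos] = fill             # push the pattern one byte further
    return None
```

Recomputing the CRC over the whole track at every step is wasteful but keeps the sketch short; a faster version would checksum the unchanged head of the track once and only redo the tail.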

#3.0 - Data tracks

Data tracks have more dependencies and interconnections (read ECMA-130/ECMA-119). This allows flexibility we couldn't have with audio, and often running an ECC check/fix is all it takes to reverse unwanted changes. You can use CDMage for this, but you'll notice it's slow and not very practical, so I use a program of my own. If you do not have CD programming experience, a good place to start is Kris Kaspersky's book 'CD Cracking Uncovered', which also includes several code examples for data regeneration from ECC. Same as with audio, though, the more samples you have, the better prepared you are: other CDs from multi-CD games, other versions (demos too), other regions - it's all good. Also please note that I'll be talking about RAW (2352 byte) sectors here, unless specified otherwise.

 

#3.1 - System area

The first 16 sectors of a CD's user data area (volume space) are reserved - unused on PCs but filled with specific data on consoles. On PSX the license and logo are stored there. There are about 3 - 4 variations of this data for each region. If the actual data doesn't match any of the common patterns, it has most likely been modified. Such modifications are usually a deliberate replacement of the logo or removal of the license. psxt001z checks the system area automatically, though it has only the most common values built in and hence can report false modifications on some occasions. Another program is reLicense, which you can configure with as many system area checksums as you like; it then calculates the CRC32 of the specified image as if it had each of those defined system areas. Replace a modified license with a good one from some other image.
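For a raw image, checking and swapping the system area is simple enough to sketch in a few lines. The function names below are hypothetical and this is not how psxt001z or reLicense work internally; the known-good checksums would have to come from images you trust.

```python
import zlib

SECTOR_SIZE = 2352
SYSTEM_AREA = 16 * SECTOR_SIZE   # first 16 raw sectors of the data track


def system_area_crc(image: bytes) -> int:
    """CRC32 of the system area alone, for comparison with known-good values."""
    return zlib.crc32(image[:SYSTEM_AREA])


def replace_system_area(image: bytes, donor: bytes) -> bytes:
    """Take the system area from donor and keep the rest of image."""
    return donor[:SYSTEM_AREA] + image[SYSTEM_AREA:]
```

Because whole raw sectors are copied, their EDC/ECC fields come along from the donor, so nothing needs to be regenerated in this particular case.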

 

#3.2 - Primary Volume Descriptor

The PVD is located right after the system area. As this structure contains some text fields, modifications here are usually made for branding. The most common ones I encountered were by Extremegames and R18; once in a while there are some odd ones, but usually it's one of these two. AFAIK both of those groups mostly operated in the Japan region. Modified fields could include: subheader 00 00 09 00 -> 00 00 08 00; System Identifier (@0x20) removed (replaced with spaces); Volume Identifier (@0x40) removed, added or replaced; Volume Creation Date (@0x345) zeroed; '$' symbol (0x24) after Volume Creation Date (@0x355) set to 0x00; text string inserted in some of the text fields. Extremegames usually have more fields modified in otherwise good images.

So what you'd do is write a program that tries all combinations with the possible modifications removed. Don't forget you have to regenerate EDC/ECC on each pass. Volume ID and Date are the most difficult ones to reverse. Still, if the date was removed, you can guess it pretty closely and brute force from there. For Volume ID you could try some forms of the serial, or look for hints in related CDs (closely related games, other games by the same developers, etc.). [Images where such modifications were removed include: SLPS-01098, SLPS-01541, SLPS-01697, SLPS-02556, SLPS-03219] R18 images usually have an RHP patch applied and the Root Directory Record modified, so you'd brute force through PVD combinations in conjunction with those [examples: SLPM-86494, SLPM-86505, SLPM-86835, SLPM-86990, SLPM-87287, SLPS-02075, SLPS-02730, SLPS-02886].
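As a first step before any brute forcing, the listed fields can be pulled out of the raw PVD sector and checked for the tell-tale signs. The sketch below uses the raw-sector offsets given above; the 32-byte lengths of the two identifier fields follow ECMA-119, while the function names and the exact checks are assumptions made for illustration. It only flags the "removed" and "zeroed" cases, not replaced or inserted text.

```python
def pvd_fields(sector: bytes) -> dict:
    """Pull the commonly modified fields out of a raw 2352-byte PVD sector."""
    return {
        "subheader": sector[0x10:0x14],
        "system_id": sector[0x20:0x40],
        "volume_id": sector[0x40:0x60],
        "creation_date": sector[0x345:0x355],
        "date_suffix": sector[0x355],
    }


def suspicious(fields: dict) -> list:
    """Names of fields that look like one of the modifications listed above."""
    hits = []
    if fields["subheader"] == bytes.fromhex("00000800"):
        hits.append("subheader")
    if not fields["system_id"].strip():
        hits.append("system_id")          # only spaces left
    if not fields["volume_id"].strip():
        hits.append("volume_id")
    if not fields["creation_date"].strip(b"0\x00"):
        hits.append("creation_date")      # all '0' digits or 0x00 bytes
    if fields["date_suffix"] == 0x00:
        hits.append("date_suffix")        # '$' (0x24) expected here
    return hits
```

With the data track in memory, the PVD sector would be `track[16 * 2352:17 * 2352]`. Each flagged field is then a candidate to restore in the combinations you try, with EDC/ECC regenerated for the sector after every change.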


#3.3 - Root Directory Record

The only instances where I saw this record modded were R18 releases. They'd usually set the File Flags of some or all entries to 0x4. It's never 0x4 on original CDs, so you can easily tell. Reset them back to 0x0 for files and 0x2 for directories. Correct EDC/ECC afterwards.
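A rough sketch of the flag reset, working on the 2048 bytes of user data of one directory sector. The record layout (length at byte 0, File Flags at byte 25, identifier length at byte 32, identifier from byte 33) follows ECMA-119. Telling files from directories by the ";" version suffix is a heuristic assumed here, since the original flag value is lost once it has been overwritten; verify the result against the final checksum.

```python
def reset_file_flags(user_data: bytes) -> bytes:
    """Reset File Flags of 0x04 to 0x00 (files) or 0x02 (directories)."""
    out = bytearray(user_data)
    pos = 0
    while pos < len(out) and out[pos] != 0:
        length = out[pos]                 # size of this directory record
        if length < 34 or pos + length > len(out):
            break                         # malformed record, stop here
        id_len = out[pos + 32]
        ident = bytes(out[pos + 33:pos + 33 + id_len])
        if out[pos + 25] == 0x04:
            # Heuristic: file identifiers carry a ";1" version suffix,
            # directory identifiers (including self and parent) do not.
            out[pos + 25] = 0x00 if b";" in ident else 0x02
        pos += length
    return bytes(out)
```

The sector's EDC/ECC still has to be corrected afterwards, which is outside the scope of this sketch.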


#3.4 - Patches


For the Japan region the patches are simple - one or two assembler instructions. There are several recurring patterns. It's interesting to note, however, that they usually differ from the ones you'd find in tables posted online (when looking for RHP) and from those in XPS patch files. So it looks like there was some competition going on and multiple people cracked the same protections at the same time. So you'd write a program that looks for a certain byte pattern (patched), replaces it with the other (unpatched), corrects EDC/ECC and checks the checksum of the whole file. Mostly there's only one patch applied per image, but in some cases there could be 2 or more, so if you get no hits, it wouldn't hurt to check for such cases too. [some examples where patches were reversed: SCPS-10140, SLPM-87012, SLPM-87089, SLPM-87216, SLPS-01830, SLPS-01831, SLPS-02561]
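The search-and-replace loop could be structured like the sketch below. Everything named here is hypothetical: the byte patterns have to come from your own analysis, and `fix_sector` stands in for an EDC/ECC regeneration routine that is not shown (by default it leaves the sector untouched, which is only right for testing the loop itself). Only the single-patch case is covered.

```python
import zlib

SECTOR_SIZE = 2352


def unpatch(image: bytes, patched: bytes, original: bytes, target_crc: int,
            fix_sector=lambda sector: sector):
    """Revert one occurrence of patched at a time until the CRC32 matches."""
    if len(patched) != len(original):
        raise ValueError("patterns must be of equal length")
    pos = image.find(patched)
    while pos >= 0:
        candidate = bytearray(image)
        candidate[pos:pos + len(original)] = original
        first = pos // SECTOR_SIZE
        last = (pos + len(original) - 1) // SECTOR_SIZE
        for n in range(first, last + 1):      # sectors touched by the change
            start = n * SECTOR_SIZE
            candidate[start:start + SECTOR_SIZE] = fix_sector(
                bytes(candidate[start:start + SECTOR_SIZE]))
        if zlib.crc32(candidate) == target_crc:
            return bytes(candidate)
        pos = image.find(patched, pos + 1)
    return None
```

Copying the whole image for every hit is slow on real images; patching in place and reverting on a miss would be the obvious refinement.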


#3.5 - EDC/noEDC

This is specific to PSX CDs. Early CDs pressed for this system had the EDC fields missing from Form2 sectors (no error control). AFAIR it was mentioned somewhere in the SDK as an error of the mastering software, corrected later on. Some CDs were released both ways: with those fields missing and with them added. Also, some early CD drives could insert EDC fields by themselves. So it might be that an image needs those fields removed or added. There's a special command set for this in psxt001z: --zektor / --antizektor.
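For reference, here is a minimal sketch of what removing or adding the Form2 EDC involves on a raw sector. It is not psxt001z's code. It assumes that a "missing" EDC means four zero bytes, uses the usual CD-ROM EDC (reflected polynomial 0xD8018001, computed over subheader and user data, stored little-endian), and detects Form2 by mode byte 2 plus bit 0x20 of the submode byte.

```python
def _build_table() -> list:
    table = []
    for i in range(256):
        value = i
        for _ in range(8):
            value = (value >> 1) ^ (0xD8018001 if value & 1 else 0)
        table.append(value)
    return table


EDC_TABLE = _build_table()


def edc(data: bytes) -> int:
    """CD-ROM EDC over data (starts at 0, no final XOR)."""
    value = 0
    for byte in data:
        value = (value >> 8) ^ EDC_TABLE[(value ^ byte) & 0xFF]
    return value


def is_form2(sector: bytes) -> bool:
    """Mode 2 sector with the Form 2 bit set in the submode byte."""
    return len(sector) == 2352 and sector[15] == 2 and bool(sector[18] & 0x20)


def strip_edc(sector: bytes) -> bytes:
    """Zero the EDC field of a Form2 sector; other sectors pass through."""
    return sector[:2348] + bytes(4) if is_form2(sector) else sector


def add_edc(sector: bytes) -> bytes:
    """Fill in the EDC field of a Form2 sector; other sectors pass through."""
    if not is_form2(sector):
        return sector
    return sector[:2348] + edc(sector[16:2348]).to_bytes(4, "little")
```

Applied to every sector of a data track, `strip_edc` or `add_edc` gives the two variants to check against the reference CRC32.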


#3.6 - Damaged files

Streaming media files on PSX lack error correction information and sometimes EDC too, as explained above. So it's not uncommon for them to get damaged on reading. In such cases you could try to replace those files with the same files from other sources. CDMage has an 'Import File' command for such cases. What I would actually do, though, is extract all files from the images (with IsoBuster) and compare the stripped-down files, free of any container wrapping, to easily spot any differences, and determine further actions based on the results.


#3.7 - Different versions

It could be possible to convert one version of a game into another if the sole difference is an updated serial number. First try renaming the exe in the file system and in SYSTEM.CNF. If the CRC still doesn't match, try renaming other references to the serial. As a last resort you could try to brute force the Volume Creation Date. Keep in mind that the EDC/ECC fields need updating after each change.


#3.8 - Other things to try

For CDs having audio tracks, quite often the very end of the data track gets damaged. Use the psxt001z --fix command then (it will regenerate damaged empty data track sectors) and Injector afterwards (the --fix command might remove some specific sectors; this program will recreate them).
If you have multiple images but can't tell which part is good in which one, you could try the reCombine program. It just runs through all possible combinations.


#4.0 - Version history

v0.1 First draft (audio only) @20120310
v0.2 Sketch on data tracks @20120311



themabus[at]inbox[dot]lv


Posted by themabus at 19:08 EET
Updated: 12/03/2012 18:05 EEST
FILE0014.CHK
Now Playing: Mission Control @SomaFM
Topic: CD images
ECMa130 (ECM alternative)
 
 
This entry is 2 years old... I'll be adding some new stuff here soon, so I'm just moving it over to have everything in the same place. ECMa130 can be downloaded here

Initially made as a proof of concept while discussing possible compression of raw (a.k.a. GDI; 2352 bytes per sector) Dreamcast GD-ROM images (redump.org, Dumpcast), it later shaped up into quite a capable utility that I find myself using daily because of its performance advantage over its counterpart. The idea is taken from Neill Corlett's ECM and developed further, implementing system-specific optimizations, improving the container format and more (e.g. the latest versions make use of multithreading for a slight performance boost on multi-core machines).

In a few words: this program preprocesses raw CD images, commonly produced by CD backup applications (CDRWIN (.bin), CloneCD (.img), ImgBurn, Alcohol 120%, etc.), removing reproducible data (such as ECC) and thus reducing the load on later, usually more complex, processing stages (e.g. delta or compression). This operation is completely lossless and is reverted when decoding, restoring images to their original state.

Dreamcast encoding (data from 15 raw GD-ROM images):

            Total size                 % of input   Time (s)
 Input      16814462112 (~15.6 GB)     100 %        0
 ECM        14662611594 (~13.6 GB)     87.2 %       685
 ECMa130    9708644352 (~9.0 GB)       57.7 %       450

Dreamcast compression:

                            Total size               % of input   Time (s)
 Input -> 7-Zip 4.65        8156554700 (~7.5 GB)     48.5 %       4167
 ECM -> 7-Zip 4.65          6717473687 (~6.2 GB)     39.9 %       3521
 ECMa130 -> 7-Zip 4.65      6674638910 (~6.2 GB)     39.6 %       3173


In the illustrated case ECMa130 executes notably faster than ECM (by about 4 min.) and removes a lot more data (about 30 % of the input). This carries over to the next processing step, 7-Zip compression, since it has far less data to analyze. Size-wise, LZMA run on ECM output catches up, ending with only about 50 MB difference (this would not always be the case with other algorithms), but the total time saved with ECMa on those 15 images is almost 10 minutes. Compared to LZMA over unprocessed data, ECMa + 7-Zip is 9 minutes faster and saves ~1.4 GB.

PlayStation encoding (data from 15 raw CD images):

            Total size                % of input   Time (s)
 Input      3878325696 (~3.6 GB)      100 %        0
 ECM        3498975325 (~3.2 GB)      90.2 %       478
 ECMa130    2996472772 (~2.7 GB)      77.2 %       116

PlayStation compression:

                            Total size               % of input   Time (s)
 Input -> 7-Zip 4.65        1638481212 (~1.5 GB)     42.2 %       902
 ECM -> 7-Zip 4.65          1417162337 (~1.3 GB)     36.5 %       811
 ECMa130 -> 7-Zip 4.65      1406660446 (~1.3 GB)     36.2 %       799

PlayStation ultra (-mx9) compression:

                            Total size               % of input   Time (s)
 Input -> 7-Zip 4.65        1598770703 (~1.4 GB)     41.2 %       1422
 ECM -> 7-Zip 4.65          1386606174 (~1.2 GB)     35.7 %       1282
 ECMa130 -> 7-Zip 4.65      1372467680 (~1.2 GB)     35.3 %       1269


In this case 7-Zip compression with default parameters runs slightly faster on unprocessed data than ECMa + 7-Zip, but with ECMa the output takes up about 6 % less space (relative to input size). On ultra compression the ECMa + 7-Zip combo produces the best overall results, beating plain 7z by 37 seconds and saving 6 % of data. ECM is by far the slowest, losing more than 6 minutes to ECMa130; the fact that it is outperformed even by plain LZMA suggests that there are likely certain issues with ECM's CD-ROM XA sector processing implementation.

All code is original, written in Pascal, with the Cyclic Redundancy Check & Reed–Solomon algorithms in assembler. Compiled with Free Pascal. Over the last year it was tested with more than 500 images for various systems (DC, PSX, SS, PCE, PC, etc.).

Posted by themabus at 15:12 EET
Updated: 12/03/2012 00:00 EEST