Overall File System and Data Structure
#1
Posted 09 December 2011 - 11:19 AM
I'll copy over some posts from the filenaming thread to kick off the discussion.
EDIT ... on second thought, it turns out I don't know how to copy topics over from another thread. Dave, if you can bring over the last two or three posts from the file extension thread, it would kick us off.
#2
Posted 09 December 2011 - 11:38 AM
The files and folders are really just one element of the overall storage plan for the data that openrails needs. For example, it is technically possible to have all the data for a route in a single file. But MSTS chose not to, for various reasons which may or may not be valid. Hopefully this discussion will dissect and challenge that rationale, along with the rest of the folders etc. that are defined.
Exactly where the data set breaks out into separate files, and how those files are organized, are both core design decisions.
Let's make the following out of scope for this discussion to try to keep it manageable.
- the exact internal format of the datafiles
- the exact names and extensions used for the various files and folders
#3
Posted 09 December 2011 - 11:50 AM
Is having individual OR vehicle assets* in a single folder structure (akin to MSTS) a bad thing?
(*Vehicle = MPU and rollingstock.)
Let me put my hand up and say, "NO, it's a good thing."
Why?
Because it's self-contained, in most cases, and it's easy to package and redistribute the asset.
Cheers Bazza
Edited.
#4
Posted 09 December 2011 - 12:05 PM
1. Major Pain - Information needed for activities is spread all over - consist files, traffic files, services, etc. It makes it hard to distribute activities. And hard to remove unwanted activities.
2. Minor Pain - Scenery shape files have their textures and mesh files split over two folders - a minor pain for adding and removing scenery items to the route
#5
Posted 09 December 2011 - 12:08 PM
We need to consider an asset author UID system, with the UID prefixing the main file names associated with an asset. The identifying UID could be alphabetical or numerical, or a combination of both.
As you know, I use the prefix BZZ to identify my assets, so if I produce an asset, such as a Mikado, even in its most simple form, my UID differentiates my asset from any other, i.e. BZZ_Mikado. This is also a case for assets from individual authors being placed in a folder structure under the author's UID, so that all future assets go into that author's main folder.
Trainsets>
UID_BZZA>sub-folders>>
>>DRGWMikado486>sub-folders>>>
>>RGSMikado384>sub-folders>>>
>>NZRJA1271>sub-folders>>>
Cheers Bazza
#6
Posted 09 December 2011 - 12:11 PM
Quote
Wholeheartedly agree....
Major point....don't let this mish-mash approach affect OR, whether it's activities, routes or assets.
Cheers Bazza
#7
Posted 09 December 2011 - 02:44 PM
wacampbell, on 09 December 2011 - 11:38 AM, said:
The files and folders are really just one element of the overall storage plan for the data that openrails needs. For example, it is technically possible to have all the data for a route in a single file. But MSTS chose not to, for various reasons which may or may not be valid. Hopefully this discussion will dissect and challenge that rationale, along with the rest of the folders etc. that are defined.
Exactly where the data set breaks out into separate files, and how those files are organized, are both core design decisions.
There is a lot to like and dislike about the storage plan implemented for MSTS. The core issue is that MSTS failed to adequately anticipate the impact of scale. Everything works very well if you have a few routes and a limited number of trainsets/consists/etc. It breaks down as these scale. Kuju clearly didn't anticipate scale, as MSTS fails to load when there is too much active data (routes/trainsets/etc).
TrainStore, among others, attempted to solve this problem by creating a passive secondary store for MSTS stuff, thereby reducing the amount of active data (routes/trainsets/etc). In essence it was a data management solution to a design shortcoming. TrainStore introduces its own issues, such as keeping everything in sync and leaving things in unplayable states because of user error.
RW attempted to solve the same problem by using Steam to "verify" the basic software content & folders, but introduced an annoying side effect. Customization gets overwritten unless special efforts were made to segregate the customizations.
What I Like about MSTS storage plan
- easy to understand, it's just folders & files like on any Windows PC
- can customize to my individual preferences
- can use generally available tools to edit (Notepad, etc)
- segregation allows for problems/mistakes to be isolated
- easy to add & delete stuff (no fancy installers/uninstallers necessary, just copy & paste or delete)
What I DISLike about MSTS storage plan
- difficult to manage at scale, lots of time & effort
- must use others' conventions (folder name, file name, etc) to share stuff (Trainset/Activities/Paths)
- lots of repetitious stuff (wav, ace, etc) that needs to be managed
- clean up (delete) is impossible unless you understand the dependencies
#8
Posted 09 December 2011 - 03:20 PM
If you allow freedom to choose any kind of name for wagons, engines etc., there is no way you can get automatic tools to sort out what you've got.
A good example is sound and cab-files. Obviously, if you download an engine you want it complete with cab and sound. That means all these files have to be included with all engines, but that causes an awful amount of duplication. Cross-reference to files from other engines, or centrally stored files, can only be done if there is a strict naming convention to which EVERYONE has to stick - almost impossible. No easy way out there.
One possible solution could be to have 'personal' fields in files which can be set by the user using special tooling. For instance, when downloading some rolling-stock the user can set these fields to what he (or she!) thinks it should be called, and these names will then be shown in consist editors etc., without the need to change the actual filenames. It will require some special 'explorer' windows to actually show these names rather than the actual filenames, but that can be done. If not set, the actual filename will be used. These fields should not be set in download items.
Consist editors would use the personal name, but would have an 'export' option which converts this to the original filenames, and can then easily select these files for transfer.
On receiving, an 'import' function would do the reverse. There would be no need to change the filenames and yet anyone can use his own naming conventions.
The same could be done for sound-files, cab-files and the like. It would not be very useful for normal static objects.
As for another point made earlier re. consists / activities / rolling stock : the problems of sharing and deleting etc. have little or nothing to do with the fact that the files are stored in different directories. Put them all in one directory and you will have the same problem - it is the fact that you cannot 'see' what's in them which is the root of all problems.
There is a lot to be said for keeping different files in different directories. At least when looking in the activities directory, for instance, you can immediately see which activities you have. Put them in the same directory as your rolling stock and you 'lose' them immediately.
Regards,
Rob Roeterdink
#9
Posted 09 December 2011 - 04:39 PM
Step 1 - open steam folder and find "railworks" folder which is buried under the Steam\steam apps\common
Step 2 - go to the railworks\assets folder
Step 3 - Know I need to go to the railworks\assets\kuji folder
Step 4 - Know I need to go to the railworks\assets\kuji\SimulatorUS folder
Step 5 - Know I need to go to the railworks\assets\kuji\SimulatorUS\RailVehicles folder
Step 6 - Know I need to go to the railworks\assets\kuji\SimulatorUS\RailVehicles\Freight folder
Step 7 - Know I need to go to the railworks\assets\kuji\SimulatorUS\RailVehicles\Freight\2-Bay folder
Step 8 - I finally get to where I want to be :>)

So, a simple file that I want is buried 6 levels deep without much guidance as to where.
And now for the route side of the house
Step 1 - Go to the Content folder instead of the Assets folder and open the Routes folder
Step 2 - choose the route I want to work on ?????

Here's the route folder structure underneath

And here's your activity folder, Fun huh?

#10
Posted 09 December 2011 - 05:47 PM
Anyway, one thing I want to speak to is the unfortunate need to occasionally rely on the old MS-DOS ..\..\ scheme, something I always thought was highly prone to both error and confusion. Without stepping into the operating system itself, would it be feasible for our data files to support the concept of the percent-name-percent convention (e.g., %routename%)? If we add something like PathShortcut ( %DonnerPass% = "c:\program files\microsoft games\train simulator\routes\3dtsDonner1" ) then somewhere further down in the file we could do something like %DonnerPass%\Shapes\3dtsnBarn4.s. It reads much better, it's not likely to be as prone to error as the MS-DOS convention, and if the original definition of %DonnerPass% is done outside of the many files, it probably could be in one location and defined once. Whether that once is "global" to the route, "global" to the installation, or "global" to the machine is something I haven't considered, but having to define it only once is, IMO, a big, big plus. I don't think we should be using junctions via the OS... I don't want to be messing with somebody's PC that way, but if we could do that on our own, I think it would be very well received.
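To make the percent-name-percent idea concrete, here is a minimal sketch in Python. It is only an illustration: the PATH_SHORTCUTS table and the expand_shortcuts function are hypothetical names, the lookup is assumed to be case-sensitive, and nothing here reflects how Open Rails actually reads its files.

```python
import re

# Hypothetical shortcut table. In the proposal this would be defined once,
# outside the individual data files (per route, per install, or per machine).
PATH_SHORTCUTS = {
    "DonnerPass": r"c:\program files\microsoft games\train simulator\routes\3dtsDonner1",
}

# A token is a name between two percent signs, with no backslash inside it.
_TOKEN = re.compile(r"%([^%\\]+)%")


def expand_shortcuts(reference, shortcuts=PATH_SHORTCUTS):
    """Replace every %name% token with its defined path; fail on unknown names."""
    def substitute(match):
        name = match.group(1)
        if name not in shortcuts:
            raise KeyError("undefined path shortcut: %" + name + "%")
        return shortcuts[name]

    # A function replacement is inserted literally, so backslashes are safe.
    return _TOKEN.sub(substitute, reference)


print(expand_shortcuts(r"%DonnerPass%\Shapes\3dtsnBarn4.s"))
```

Raising an error on an undefined name, rather than leaving the token in place, is one possible choice; it surfaces a missing definition at load time instead of as a missing file later.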
#11
Posted 09 December 2011 - 06:36 PM
Genma Saotome, on 09 December 2011 - 05:47 PM, said:
This was a clever workaround developed by the MSTS community so that they could share sounds, cabs and the like among different folders. It seems there should be a better way.
As Rob already mentioned, there is the fundamental issue. If we use a scheme of shared folders for common items, we get issues with missing dependencies and finding all the related file components. If everything is included in the loco package, then it's easy to install with no dependencies. But there's a waste of disk storage space. There are also potential performance issues, although some clever programming could detect the duplicates and deal with them properly to prevent wasted resources.
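One way the duplicate detection could work is to group files by a hash of their contents, so that the same horn or texture shipped inside several loco packages is recognised as one resource. The sketch below is an assumption about how that might be done, not something the simulator is known to do; find_duplicates is a hypothetical name.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicates(root):
    """Group files under root by the SHA-256 of their bytes.

    Returns a list of groups; each group is a list of paths whose
    contents are byte-for-byte identical.
    """
    by_digest = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_digest[digest].append(path)
    # Only digests seen more than once are duplicates.
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

A loader could then keep a single copy of each group in memory, whatever folder the copies were installed in. Reading whole files into memory is fine for a sketch; large assets would want chunked hashing.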
As for the RW directory structure, it was never meant to be worked on directly. Users are meant to add rolling stock and other assets using the menu tools provided. They are similarly removed via menu. Routes are installed, copied, deleted the same way - all from menus. Some people do indeed go in and hack at the files directly, but there is really no need for most users. But your point is well taken: keep it simple and clear at the directory level.
#12
Posted 09 December 2011 - 07:19 PM
I don't think we want to impose a system where the user doesn't have full control over their local content. Of course, this means that they also have the ability to royally screw things up.
I don't think UIDs are going to do us any good unless we plan to do some sort of centralized content management system - and I don't see that happening. Without the central control, there really isn't anything to enforce it, so why bother trying? By and large, I think most members of the train sim community want to be respectful (although it's certainly not always the case). So in terms of per-user organization, I think we need to perhaps adopt a convention, but not enforce anything.
As much as I like databases, I don't think a full-blown RDBMS belongs on a user machine for just one program. So we're left with files. We should probably avoid having to index every file to look for keys that might be referenced from inside other files since this leads to scalability issues, so that leaves us with linking the files directly.
In terms of file references, I think we need to keep it as flexible as possible. I propose that file links should all support absolute, relative, and aliased (both named and a default) paths. I don't think absolute paths (i.e. 'C:\My Folder\Something\thefile.abc') will be used except maybe in testing, but I don't see why we should specifically exclude them. Relative paths ('textures\mytexture.dds' or '..\..\common\tree.cfg') will be useful for child content, with the parent directory pathing probably having some occasional valid reason for use. Aliased paths will be a big benefit in flexibility of configuration. As I brought up here, we could use the tilde (~) as a special character to denote an alias. ~\ would be the default (whatever we decide that should be) and then ~[name]\ would resolve to whatever the [name] path was set to. I don't like the concept of asking for a file, and then it looks in one place and then another if it doesn't find it in the first place, and I don't think it's necessary if we aren't planning on having a massive pile of stuff in a global folder.
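A rough sketch of how the three link styles could be told apart, in Python. Everything named here is an assumption for illustration: the ALIASES table and its folders, the resolve_link function, and the reading of ~[name]\ (the sketch accepts the name with or without the square brackets).

```python
from pathlib import PureWindowsPath

# Hypothetical alias table; the empty name stands for the default alias (~\).
ALIASES = {
    "": PureWindowsPath(r"C:\OpenRails\Content"),
    "sounds": PureWindowsPath(r"D:\Shared\Sounds"),
}


def resolve_link(link, referencing_file, aliases=ALIASES):
    """Resolve an absolute, relative, or ~-aliased link to a Windows path."""
    if link.startswith("~"):
        head, _, rest = link.partition("\\")
        name = head[1:].strip("[]")  # "~" -> "", "~[sounds]" -> "sounds"
        return aliases[name] / rest  # unknown alias raises KeyError, no fallback
    path = PureWindowsPath(link)
    if path.is_absolute():
        return path
    # Relative links are taken from the folder of the file that contains them.
    # Any "..\" parts are kept as written rather than collapsed.
    return PureWindowsPath(referencing_file).parent / path


print(resolve_link(r"~[sounds]\horn.wav", r"C:\Content\Loco\loco.eng"))
print(resolve_link(r"textures\mytexture.dds", r"C:\Content\Loco\loco.s"))
```

Each link resolves to exactly one location, with no search through a list of fallback folders, which matches the preference stated above.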
This would allow people to organize their content however they see fit, but still allow for commonality between users using aliasing. We could offer a set of out-of-the box aliases pointing to folders intended for certain types of content (track, cabviews, sounds, etc.) with the hopes that people will latch on to that. It would be possible to create a sort of package installer that could copy files into the various aliased locations to aid in setup.
For rolling stock, we could have a configurable list of locations using paths as designated above that will be included when putting together consists. This would probably work for shapes too during route construction (although I haven't looked at the next gen route editor doc to see what direction that's going).
Regarding keeping related content together versus spreading it around, so long as people are going to latch on to other people's content, I would advocate splitting it up by type rather than by project. What I've proposed above won't preclude either method, but I think we should be encouraging reuse, and having to reference a cabview off in some folder that was originally specific to one locomotive is not the cleanest solution. As long as the content creators play nice and pick unique references, then you could have your BZZ folder underneath several locations that you could park your content in depending upon type.
#13
Posted 10 December 2011 - 02:53 AM
Firstly, the game would have one or more designated "content directories". These would be where content is installed, either by the game or the user, and could include a directory in the program area (for stuff we include) and a directory for the user to install things.
Content within a content directory would be allowed in two formats:
- A directory.
- A zip file.
The zip file would contain a directory structure identical to the directory-style of content and would be treated exactly as if it had been unzipped into the content directory. The purpose of this would be to allow simple development (directory-style) and simple distribution (zip-style).
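As a sketch of what "treated exactly as if it had been unzipped" could mean in practice, the Python below lists every file in a content directory the same way whether its package is a plain directory or a zip. The list_content name is hypothetical, and the sketch assumes each zip holds a single top-level directory named after the package.

```python
import zipfile
from pathlib import Path


def list_content(content_dir):
    """List all package files under content_dir as forward-slash relative paths.

    Directory-style packages are walked on disk; zip-style packages are
    read from the archive index without being extracted.
    """
    content_dir = Path(content_dir)
    files = set()
    for entry in content_dir.iterdir():
        if entry.is_dir():
            files.update(
                p.relative_to(content_dir).as_posix()
                for p in entry.rglob("*")
                if p.is_file()
            )
        elif zipfile.is_zipfile(entry):
            with zipfile.ZipFile(entry) as archive:
                # Archive names ending in "/" are directory entries, not files.
                files.update(n for n in archive.namelist() if not n.endswith("/"))
    return sorted(files)
```

With this shape, the rest of the loader only ever sees one flat list of package-relative paths and does not need to know which form a package was installed in.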
The content of a package directory would be a pre-defined list of directories (only those used need to be included), each containing a single file for each item of that type. For things that are genuinely related, maybe one extra level of directory could be required; we could also keep something similar to MSTS regarding texture selection with weather/season. In general, though, everything inside the pre-defined directories is under the content author's control - they can create sub-directories to group things if they want.
- All references to other files would know what type of file is expected and would allow only relative paths.
- If the source and reference file type are from the same pre-defined directory, the base of the relative path would be the source file's path, e.g. a track file referencing a track database of "foo.tdb" would use "foo.tdb" from its own directory.
- If the source and reference file type are from different pre-defined directories, the base of the relative path would be the pre-defined directory of the appropriate file type within the package, e.g. a shape referencing a texture of "foo.ace" would use "\Textures\foo.ace" from the package it is within.
- References would be allowed to go up a single directory beyond the pre-defined directory and back down into a different package, e.g. "..\package_name\foo.ace" would use "\Textures\foo.ace" from package "package_name". The package name would be the directory name (the directory inside the zip for zip-style packages), but we could add a better uniqueness to this (e.g. by having packages have UUIDs and using that) if needed.
Actual files
\UserContentPackage1
\Snow
\Whatever.rdb
\Whatever.rit
\Snow
Virtual files
Routes
\Whatever.rdb
\Whatever.rit
\Snow
\Snow
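The reference rules in this post can be sketched as a small resolver. This is one reading of the rules, written in Python for illustration only: TYPE_DIRS, its directory names, and resolve_reference are all hypothetical, and the result is a path relative to the content directory.

```python
from pathlib import PureWindowsPath

# Hypothetical mapping from the expected file type to its pre-defined directory.
TYPE_DIRS = {"shape": "Shapes", "texture": "Textures", "sound": "Sounds"}


def resolve_reference(package, source_type, source_subpath, target_type, reference):
    """Resolve a relative reference made by a file inside a package.

    source_subpath is the source file's path inside its own pre-defined
    directory, e.g. "Whatever.s" or "Boxcar\\boxcar.sms".
    """
    # Virtual base: the package, plus the source's sub-directory when the
    # source and target live in the same pre-defined directory.
    parts = [package]
    if source_type == target_type:
        parts += list(PureWindowsPath(source_subpath).parent.parts)
    for part in PureWindowsPath(reference).parts:
        if part == "..":
            if not parts:
                raise ValueError("reference escapes the content directory")
            parts.pop()
        else:
            parts.append(part)
    if len(parts) < 2:
        raise ValueError("reference does not name a file inside a package")
    target_package, *rest = parts
    # The pre-defined directory is implied by the type, never written out.
    return "\\".join(["", target_package, TYPE_DIRS[target_type], *rest])


print(resolve_reference("UserContentPackage1", "shape", "Whatever.s", "texture", "foo.ace"))
```

The package name acts as one directory level in the reference, while the pre-defined directory is filled in from the expected file type, which is why "..\package_name\foo.ace" is enough to reach another package's texture.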
#14
Posted 10 December 2011 - 06:10 AM
So far I am hearing
- simplify deployment and removal of addons - make it drag and drop to install
- ensure we can support sharing and reuse of ... everything...
- ensure we can manage huge collections of addons ( PS Chris please elaborate - what are the problems now wrt file structure)
- make it easy for content creators to 'code' these links into their files
James said:
I definitely like the sound of this. Saves disk space .. simpler deployment .. and probably faster to parse (on magnetic hard disks at least).
James, a few questions to help me understand your concept a little more:
1. What problems of the MSTS file structure does your concept aim to solve?
2. What do you mean by the 'Virtual files' section on your diagram?
3. Could you provide an example of what the directories would look like inside the zip file for someone distributing a new boxcar?
#15
Posted 10 December 2011 - 07:04 AM
wacampbell, on 10 December 2011 - 06:10 AM, said:
Mostly it's about self-containment; whatever I want to distribute, it can be done in a single directory or zip that can be installed with a single copy/save in the right directory - no file overwriting, no file copying scripts (because you can reference other packages) - and removed with a single delete. Each package could contain as much or as little as it wanted - from an entire route with scenery, rolling stock and activities, right down to a package containing only vegetation textures for others to use. Everything would be in this structure, with no special global locations or such like - though you could include a package called "global" with default stuff people can use.
wacampbell, on 10 December 2011 - 06:10 AM, said:
This was an attempt to visualise how the relative file references work; the point being that the package names make up one directory level, but the pre-defined directories don't (as they're implied by the type of reference). E.g. "\UserContentPackage1\Shapes\Whatever.s" could reference another package's shape with "..\UserContentPackage2\Whatever.s" instead of the more verbose "..\..\UserContentPackage2\Shapes\Whatever.s".
wacampbell, on 10 December 2011 - 06:10 AM, said:
\My_Boxcar.zip
\boxcar2.wav
\boxcar3.wav
\boxcarcolor.ace
\boxcarmisc.ace
If I wanted to make a consist in another package use this boxcar, it would reference it as "..\James_Ross_Boxcar_1\boxcar.wag". All the references within would just be a filename - no path components would be needed, as they're all at the top of their respective pre-defined directories. They don't have to be, though; if I put boxcar1.wav, etc., into "\My_Boxcar.zip\James_Ross_Boxcar_1\Sounds\Boxcar" then boxcar.sms would use "Boxcar\boxcar1.wav", etc.
I'm not a content creator, so I might've forgotten some part you need here - but the idea should be clear.