Tuesday, October 5, 2010

#0045 Ten Essential Rules of File Naming 4-6

Previous:      Asset Management 101 –  Ten Essential Rules of File Naming 1-3


Asset Management 101 – part 4
 In my last post we looked at the first three of the ten essential rules: Let it Speak, Make it Short, Keep it Simple.

These provide the foundation of a good asset naming scheme. Before we go on, you will notice that in this series I use the terms file and asset interchangeably. For the most part they are the same thing, but sometimes an asset is a collection of files. For example, a sequence of image files is actually one asset.


Let's look now at rules four through six:


#4 Avoid Special Characters and Control Punctuation

Special characters such as colons, semicolons, parentheses, brackets, asterisks, ampersands, and so forth can and will choke some application you use now or may need in the future. This may seem like really basic stuff, but it's also stuff that's regularly ignored. Here's a primer on what to do and not do, and why:

Use command-line compatible punctuation

For safety, I suggest keeping all file names friendly for UNIX or Linux command-line work. While most users today interact with files through file browsers built into the operating system or application, advanced users rely on command-line and scripted solutions to speed repetitive tasks. In UNIX, the only foolproof punctuation characters are underscores and periods. Anything else can and will cause problems. Having to escape special characters will increase your in-house program development time, AND errors will still occur in file operations.
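The rule above can be checked automatically. Here is a minimal sketch, assuming that "safe" means ASCII letters, digits, underscores and periods only, and that a name should not start with a period. The function names `is_command_line_safe` and `unsafe_characters` are made up for this illustration.

```python
import re

# Assumption: a safe name uses only ASCII letters, digits, underscores and
# periods, and does not begin with a period (a hidden file on UNIX).
SAFE_NAME = re.compile(r"[A-Za-z0-9_][A-Za-z0-9_.]*")
SAFE_CHAR = re.compile(r"[A-Za-z0-9_.]")


def is_command_line_safe(name: str) -> bool:
    """Return True if the whole name is made of safe characters."""
    return SAFE_NAME.fullmatch(name) is not None


def unsafe_characters(name: str) -> list:
    """List each offending character once, in order of first appearance."""
    found = []
    for ch in name:
        if SAFE_CHAR.fullmatch(ch) is None and ch not in found:
            found.append(ch)
    return found
```

Running `unsafe_characters("final comp (v2).ext")` reports the space and both parentheses, which is handy when cleaning up a folder of inherited assets.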

Avoid Spaces

The most famous special character to avoid, now widely violated, is the inclusion of spaces w i t h i n f i l e n a m e s. As illustrated, spaces can make things less understandable if overdone. You and I can read my little example, but a text parser will certainly not know what to do with it. While most software today will tolerate a space in an asset file name, it only takes one application to puke and your whole naming convention is in the dumpster. If that application is your in-house back-up script, I pity the fool who loses a fi le with a space in the name.
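To see how easily a space breaks a simple tool, here is a small sketch of a naive parser. It does not model any particular application; `split_file_list` is a hypothetical helper that splits a line of file names on whitespace, as many quick in-house scripts do.

```python
def split_file_list(line: str) -> list:
    """Naively split a line of file names on whitespace."""
    return line.split()


# Two files go in, but the parser sees three names.
names = split_file_list("comp_01.ext final comp.ext")
# names == ['comp_01.ext', 'final', 'comp.ext']
```

Neither "final" nor "comp.ext" exists on disk, so a back-up script built this way would skip the real file.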

Two periods maximum

Stick to periods to separate basename, sequence or frame number, and extension. Period. No extra periods. Extra periods will cause chaos with some software that expects the first period to denote the sequence number and the last (or only) to designate the extension. Period.
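A parser that relies on the two-period rule might look like the following sketch. It assumes names of the form basename.frame.ext or basename.ext, with an all-digit frame number. The function name `split_asset_name` is made up for this example.

```python
def split_asset_name(name: str):
    """Split 'basename.frame.ext' or 'basename.ext' into (basename, frame, ext).

    frame is None when the name has a single period. Anything with zero or
    more than two periods, or a non-numeric frame, raises ValueError.
    """
    parts = name.split(".")
    if len(parts) == 2:
        base, ext = parts
        return base, None, ext
    if len(parts) == 3:
        base, frame, ext = parts
        if not frame.isdigit():
            raise ValueError(f"frame number is not numeric: {name!r}")
        return base, frame, ext
    raise ValueError(f"expected one or two periods: {name!r}")
```

With this, "comp__01.0042.ext" splits cleanly into basename, frame and extension, while a name like "comp.v2.0042.ext" is rejected instead of being silently misread.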

The hyphen – if you dare

The next one to risk, and I use the word risk with intent, is the hyphen or short dash. The hyphen is tolerated well most of the time. But be careful: in UNIX, a leading hyphen marks a command option. So while you can usually use it safely, it is ill-advised because, like the space, it can break things. Pity, because it looks great in file names.

Use postScript Notation

Other than underscores, the best way to separate words or concepts is what is called Postscript notation. It's now common in scripting languages, and to remind, it goes like this:
nocapitalCapitalCapitalCapitalCapital.ext
In more words, don't capitalize the first word, eliminate spaces, and capitalize words that follow.
Postscript notation is great, but it breaks down when you have initials, and doesn't always read so well, and that's when underscores help out:
rGB_1stDetail_Layer.ext
You'll notice that in this example, the “r” in RGB is lowercase. You can expect that users will violate the rules and make abbreviations like this all caps: “RGB_” at the start of assets unless you make the rule to keep leading abbreviations lower case:
rgb_1stDetail_Layer.ext
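If names are built by a script, the notation is easy to generate. This is a minimal sketch; `to_postscript_notation` is a hypothetical helper, and it lowercases the whole first word so that leading abbreviations follow the rule above.

```python
def to_postscript_notation(words) -> str:
    """Join words with no spaces: first word lowercase, later words capitalized."""
    if not words:
        return ""
    first = words[0].lower()
    # Capitalize only the first character so the rest of each word is untouched.
    rest = [w[:1].upper() + w[1:] for w in words[1:]]
    return first + "".join(rest)


name = to_postscript_notation(["RGB", "detail", "layer"]) + ".ext"
# name == 'rgbDetailLayer.ext'
```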

Special Emphasis

There is no way to underline text in file names, but special emphasis can be achieved when necessary with doubled underscores. Notice how the word “1st” stands out in the next example.
rgb__1st__Detail_Layer.ext
Hint: I like to set off take numbers (some people call these version or revision numbers, but I will explain why some files have version numbers and some have take numbers later) with this technique, avoiding the letter:
comp__01.ext

#5 Protect Sort Order

Maintain Chronology

Often people will put dates in file names. When doing this, don't put the day ahead of the month or the month ahead of the year. Make sure all file names with dates will sort in date order by always putting them in year, month, day order. If you ever include time, make it follow the date. Never leave out the year because sooner or later, a project will cross over two years. (For short duration projects, you might be tempted to leave off the month, but again, think ahead....) Some examples:
comp__2010_10_14__01.ext
comp_20101014.ext
comp_2010-10-14c.ext
The last is a good example of when you might wisely use hyphens, if you dare. Note the use of double underscores in the first example. Note also that in the third example, the letter “c” follows the date. This is an excellent way to version date-named files, because it is distinctive and gives you 26 revisions with one character. (27 if you make “a” the second version.)
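The sorting argument is easy to verify. The sketch below builds the same three dates in year-month-day order and in day-month-year order, then sorts both lists alphabetically, as a file browser would. `dated_name` is a made-up helper for this illustration.

```python
from datetime import date


def dated_name(base: str, day: date, ext: str) -> str:
    """Build 'base_YYYYMMDD.ext' so alphabetical order is date order."""
    return f"{base}_{day.strftime('%Y%m%d')}.{ext}"


days = [date(2011, 1, 3), date(2010, 10, 14), date(2010, 11, 2)]

year_first = sorted(dated_name("comp", d, "ext") for d in days)
# ['comp_20101014.ext', 'comp_20101102.ext', 'comp_20110103.ext']  chronological

day_first = sorted(f"comp_{d.strftime('%d%m%Y')}.ext" for d in days)
# ['comp_02112010.ext', 'comp_03012011.ext', 'comp_14102010.ext']  scrambled
```

In the day-first list, the October file sorts last even though it is the oldest.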

Variations and Passes after take or version

I cannot tell you how often I see labels identifying variations, particularly layers and render passes, before the version or take number. Almost always. Maya's built-in syntax machine does it this way, and, in my humble opinion, it's backward because it makes sorting on revision level break.
A render layer or render pass is always
Why this is important should be obvious if you are a compositor: you want the latest revision of each item. It is much easier for a compositor to go to the folder where the renders are placed and get all the elements with the latest revision. Suppose the compositor has been told a new revision is ready, and there are seven layers but only three new elements or passes; that implies the prior revision of the other four is still valid. If the version number comes after the pass name, the compositor has to be more careful to collect all the right elements.
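Here is a sketch of what the compositor's task looks like when the take number comes before the pass name. It assumes a pattern of base__take_pass.ext, which is only one way to apply the rule; the pattern and the helper `latest_take_per_pass` are made up for this illustration.

```python
import re

# Assumed pattern: base__TT_pass.ext, for example comp__02_diffuse.ext
ELEMENT = re.compile(r"^(?P<base>.+?)__(?P<take>\d+)_(?P<pass>[A-Za-z0-9]+)\.(?P<ext>\w+)$")


def latest_take_per_pass(names) -> dict:
    """Return {pass name: file name} keeping the highest take of each pass."""
    latest = {}
    for name in names:
        m = ELEMENT.match(name)
        if m is None:
            continue  # not a render element under the assumed pattern
        take = int(m.group("take"))
        current = latest.get(m.group("pass"))
        if current is None or take > current[0]:
            latest[m.group("pass")] = (take, name)
    return {pass_name: name for pass_name, (take, name) in latest.items()}


renders = [
    "comp__01_diffuse.ext",
    "comp__01_specular.ext",
    "comp__01_shadow.ext",
    "comp__02_diffuse.ext",
]
picked = latest_take_per_pass(renders)
```

Here only the diffuse pass has a second take, so the shadow and specular elements from take 01 are still the valid ones. Because the take comes first, an alphabetical listing of the folder also groups each take together.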
The final reason: it has to do with my next rule:

#6 Rendered Files Must Refer to Their Script

Every file produced in a digital pipeline is either a script, metadata (such as a shot list) or a product file, and every product file is either directly produced, such as a hand-rendered illustration, or indirectly produced, such as script-driven renders. Every CG or composite render comes from a script, and the naming of the script and the data products it generates must always match.
If I have a most vital rule, this is it. Data products must match their script names. Or, more accurately, because an artist's focus should be on producing a new take or revision, the script must be named so that it refers to the product it produces, and the product must identify the script used to make it.
Ignorance of this rule is one of the most common causes of chaos. Imagine you've been asked to revise 23 elements, but have no idea which scripts in the shot tree made them. You have to open every script until you find the one that generated the current version of each element.
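When names match, finding the script becomes a string operation instead of a search. The sketch below assumes products are named base__take, optionally followed by _pass and .frame, and that the script shares the base__take stem. Both the pattern and the "script" extension are placeholders for this illustration, not a statement about any particular application.

```python
import re

# Assumed product pattern: base__TT[_pass][.frame].ext
PRODUCT = re.compile(
    r"^(?P<stem>.+?__\d+)(?:_[A-Za-z0-9]+)?(?:\.\d+)?\.[A-Za-z0-9]+$"
)


def script_for_product(product: str, script_ext: str = "script") -> str:
    """Derive the script name that should have produced this product file."""
    m = PRODUCT.match(product)
    if m is None:
        raise ValueError(f"name does not follow the assumed pattern: {product!r}")
    return f"{m.group('stem')}.{script_ext}"


script = script_for_product("shot010_comp__03_diffuse.0042.ext")
# script == 'shot010_comp__03.script'
```

With 23 elements to revise, a loop over the product names yields the list of scripts to open, with no guessing.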


Next:            
Asset Management 101 – part 5: Essential Rules of File Naming 7- There Can Be Only One

Sunday, October 3, 2010

#0044 Ten Essential Rules of File Naming 1-3

Previous:      Asset Management 101 –  part 2: Understanding File Classification


Asset Management 101
The Ten Essential Rules of File Naming
Ever sit at the computer wondering what to name a file? Who hasn't? In the world of CG-VFX, this moment of decision gives either life or death to your file system organization. Good asset management begins with a good name. The magic of naming things is to remember the basic rule – I Am What I AM.

What does it mean, “I Am What I AM?” For most people, this fundamental question is exactly why there is a moment of indecision about what to name a file: too often a user doesn't really understand what a particular file IS.
 
Remember, in my last article I explained that there are more than 14 broad categories for every asset. So what a file is comes from those 14 plus categories.

Except in some far-off alien world, most files have an extension, which takes care of the file data type. By the way, file data type was deliberately not included in my list of 14 ways to classify an asset, precisely because I consider this an inherent classification. So, since the extension (almost) always provides the essential file data type information, let's continue to ignore file data type.
 
Now, I'd like to share my 10 Essential Rules of File Naming. Today we look at the first three:

#1 Let it Speak

File or asset names work when they speak to you. Now, how can a name speak? What I mean is that a good name will tell you lots of information about the asset. This is the core, and this is the reason we need to consider the 14 classification systems.
 
For example, an asset name often contains a shot number, some sort of revision number, and often a frame number. (Frame or sequence position numbers are also outside my classification systems, but they are clearly relevant and must be given proper attention so that they do not confuse your applications.)
We will come back to this later. For now, keep in mind that the main function of the asset name is to speed understanding and help bring the most important information forward.

#2 Make it Short

My first principle is a simple one: keep the name as short as possible. The old-fashioned 8.3 naming rule was about as useful as used gum on a shoe, but names that are too long can cause problems. The first problem is that long names can eat up name memory space in software that limits this sort of thing, which is common in older or legacy applications that still have an old-fashioned code base. If the names are X characters long and you have N files, N*X can be too much. I've seen it in some apps' bins and in at least one OS. Another problem is that a name that is too long could break a database program that assumes a maximum file name length for a data field. Too long and the name could get truncated, making it inaccurate or useless, or worse, the program could crash or refuse the entry. Even if the file name length is accepted, displaying it could be a problem in a spreadsheet or data form with a fixed-width name field. So, if designing a name space for your assets, consider your legacy applications' restrictions. (If designing an application, consider your users' name space needs.) The best reason to keep names short is that you (and everyone else) will be typing them again, and again, and again.

#3 Keep it Simple

Overly elaborate naming conventions suffer from several problems. First and most significant, users will forget the rules or disregard them. Also, an overly elaborate name may be too long, be difficult to read or contain characters that make your OS or applications choke.

While these first three rules may seem basic, they provide a foundation for all the rest.  We'll look at the next set in tomorrow's installment.
 
Next:            Asset Management 101 – part 4: Ten Essential Rules of File Naming 4-6