Sql Coding Style

Lots of discussion over on Coding Style about how to layout conventional block scoped languages. But what to do with SQL?

This is my favored style:

select family.name, person.name, person.age from family, person where person.family''''''ID = family.ID and person.gender = 'M';

Features

Vertically compact

Items in lists are aligned together (eg: columns, tables, predicates)

What style do you use - and why? What common standards are there? -- Bevan Arps


Another example is near the bottom of Sql Flaws:

Capitalize SQL keywords.

Put separate constraints and column names on separate lines.

Indent equally detailed expressions equally.

Use a monospace font.

Right align equally important SQL keywords equally.

SELECT Person.name AS first''''''Name, Family.name AS last''''''Name, Person.age FROM Person, Family WHERE Person.family''''''ID = Family.ID AND Person.gender = 'M' ORDER BY last''''''Name ASC, family''''''ID ASC, first''''''Name ASC, age DESC;

I use SQLServer exclusively and prefer its join syntax, also like to capitalize the operators though I'm not sure why, and table and field names also being with uppercase, so I would do that as

SELECT Family.Name, Person.Name, Person.Age FROM Person INNER JOIN Family ON (Family.ID = Person.Family''''''ID) WHERE Person.Gender = 'M';

so essentially the same formatting style, but can't stand the multiple tables in the from clause and the capitalization makes the tables feel more like classes, so I prefer it.

One nit I have about this is that "inner join" takes up too much room, pushing the stuff over. My eyes are too old to use those tiny micro-fonts that some newbies like (that's gotta takes it toll someday), and thus screen real-estate is important to me. I'd push the join clause over and not indent more than what "SELECT" takes up. I find 4 spaces to also be a good compromise:

SELECT ....per.name AS firstName, ....fam.name AS lastName, ....per.age FROM Person per INNER JOIN Family fam ON (fam.ID = per.familyID) WHERE per.gender = 'M' ....AND fam.categ = 123 ORDER BY lastName, firstName, age DESC

(Dots to prevent Tab Munging) How to handle the indentation of ANDs/OR's is kind of tricky. We don't want them at the same level as the main keywords because they are not, but we don't want to waste too much white-space indenting them heavily, for they are often the biggest part of the statement. Sometimes I only indent them 2 spaces instead of 4. Another compromise may be to indent the SELECT items at 7 spaces to save a line but allowing the first one to be up next to SELECT. But, I don't like using 7 on the rest of the items after SELECT. Thus, I could sum up my indentation rules as follows:

SELECT: 7 spaces if most columns are simple, 4 or 2 spaces if involved (lots of expressions).

WHERE items (AND etc.): 4 for small statements, 2 for large ones.

The rest: 4

Also, I don't put ASC in the order-by clause because it adds unnecessarily clutter in my opinion. Defaults are time- and eye-savers and are usually a good thing.


Why ignore the fact that "," is a keyword/operator?

SELECT ALL Family.Name , Person.Name , Person.Age

FROM Person INNER JOIN Family ON (Person.Family''''''ID = Family.ID)

WHERE Person.Sex = 'M' AND Family.Name = 'Cunningham'

ORDER BY Person.Age , Person.Name

Because I find that approach ugly.

Interestingly, a friend of mine has made extensive use of this approach because it minimises the number of differences highlighted by version control tools when you add extra columns.

Sounds like you are compensating for a fault in the tool rather than readability for readability's sake.

For example, what if the above code was modified to include another column in the result:

SELECT ALL Family.Name , Person.Name , Person.Age , Person.MaritalStatus ...

This shows up as a one line change. If Commas are at the end of the line, it's a two line change.

Of course, with all that said, I still prefer trailing commas. ;-) -- Bevan Arps

I Have This Pattern for C++ code, when a function call has enough arguments that I decide to put them one to a line. I rationalize it as done here, that it minimizes the errors when adding a parameter -- Pete Hardie


I tend to take a simpler approach. I avoid breaking the statement into multiple lines unless necessary and I don't have the patience to align things into columns, much less right justify them. I don't have any problem reading the other styles, so I do not feel it is necessary to force other people to adopt my writing style. Any way, my preferred rules for typing SQL statements are:

Type the SQL statement on a single line. Only add line breaks if the line extends off screen.

The length of a line is determined by the monitor and screen space in use. "80" is not a magic line length number.

If a couple of characters fall off the end of the screen, it is okay.

If line breaks are needed, first add them just before the major SQL commands like FROM, WHERE, ORDER BY, and GROUP BY.

There is no need to use a line break before every SQL command, only where needed to fit on a line.

If a WHERE clause needs a line break, add them before every AND statement.

Use all or nothing for adding line breaks to WHERE clauses. Either put the WHERE clause on a single line or add a line break before every AND.

If a WHERE or AND line needs to be broken down further, add a line break and indent following an = or IN statement.

If a SELECT clause needs a line break, add them before every column name and indent once.

Indentation can be either a TAB or a SPACE, but should be only a single key press.

Use all or nothing for adding line breaks to SELECT clauses. Either put the SELECT clause on a single line or add a line break before every AND.

Examples:

SELECT column1, column2 FROM table WHERE key = 'ID';

SELECT column1, column2 FROM table WHERE key = 'ID';

SELECT column1, column2 FROM table1, table2 WHERE key = 'ID' AND table1.ID = table2.ID

SELECT column1, column2 FROM table1, table2 WHERE key IN (select key from table3 where name like 'SMIT%') AND table1.ID = table2.ID

I prefer to keep SQL statements to as few lines as possible because most of the time, I am interested in the flow of the statements, not the details of any particular statement. The more lines a statement requires, the fewer statements fit on a screen. There is a balance, though. After scanning through a sequence of statements, I will eventually come to a statement of interest. I now want to be able to see the details of that statement without having to scroll to the end of the line and certainly do not want to scroll back and forth to compare different sections. I want to balance two things, the ability to quickly scan a code segment and the ability to see all of the details of a specific statement within the bounds of the screen. Super regimented consistency does not help me meet my goals and sometimes interferes. Finally, if I decide I don't like how a particular SQL statement looks, I can always change it (and change it yet again later).

I understand your viewpoint, but I'm not sure I agree with your conclusions. Many of the times that I'm dealing with reading someone else's SQL, it's a monster statement like the one on Sql Flaws that does the whole shebang in one hit. I guess I'm biased towards having SQL formatted in a consistent way no matter the scale. -- Bevan Arps

Consistency and readability are sometimes at conflict. A consistent format may make some coding patterns easy to read and some hard.


I use the following style:

SELECT f.ID AS fooID, f.name AS fooName, b.ID AS barID, b.name AS barName, b.description FROM foo f JOIN bar b ON (b.fooID = f.ID) WHERE f.ID = 42 AND b.name LIKE '%fiz%' ORDER BY b.ID

If a list has only one member, it can follow the keyword directly, otherwise every entry gets its own line. This way it's clear what gets selected. Having all parts of the where-clause on its separate line enables me to comment out any part of it easily. Continuing in this vain I could put each comma in the select list on its own line as well, but I think that's just plain ugly. -- Aalbert Torsius


I found a cool tool on the web that formats SQL on-line:

If you are stuck with a big blob of unformatted SQL, this may help. (Disclaimer: they may be mining your company's business processes for patents etc. User beware.)


I like this style:

I use the following style:

SELECT f.ID AS fooID ,f.name AS fooName ,b.ID AS barID ,b.name AS barName ,b.description FROM foo AS f INNER JOIN bar b ON b.fooID = f.ID AND other AND other WHERE f.ID = 42 AND b.name LIKE '%fiz%' AND other AND other ORDER BY b.ID ,more ,still more

It's compact, the leading commas are great for maintenance as the list changes (usually the first item in the list is not frequently changed), when there are multiple WHERE clauses, JOIN clauses, ORDER BY terms, GROUP BY terms, the leading commas make it easy to maintain and read. There are no complex column alignments beyond leading tabs, and in simple quick code generation cases, it is easier to skip a leading comma on the first term than to decide whether to add a trailing comma on a term and there isn't any sloppy "remove the last character if it's a comma" logic. - Cade Roux



See original on c2.com