The structure of a postgres database

Since my work at Penguin was primarily systems work, I haven’t had to work with databases much for quite a while. That is really a pity, since I love the elegance and simplicity of the interface, and the strength of the pre- and post-condition constraints. Anyone who has managed to push my database hot button knows that my database of choice is PostgreSQL. It is open source to the core. The developers treated transaction processing as a baseline, not a feature to be added later. It supports nested SQL statements (the S in SQL is for “Structured”). I am happy to be plugging away at a Postgres database again.

Part of my current project entails porting the backing store for an application from Oracle to Postgres. There is a nuke script that attempts to remove all objects from a database that have a certain prefix. I will change the names here to protect the guilty. Let’s pretend that my database is to hold overly emotional music references, and thus all of the tables, views, triggers, and so on are prefixed with ‘EMO’. The Oracle script does this:

declare
sqlstmt varchar(200);
varowner varchar(50);
TYPE cur_type IS REF CURSOR;

objectcursor cur_type;

Begin
select USER into varowner from global_name;

open objectcursor for
select 'drop ' || object_type || ' ' || owner || '.' || object_name
|| decode(object_type, 'TABLE', ' cascade constraints')
from all_objects
where owner = varowner
and (object_name like 'EMO%' or object_name like '%STAT%PROC')
and object_type in ('TABLE', 'VIEW', 'SEQUENCE', 'SYNONYM',
'FUNCTION', 'PROCEDURE', 'PACKAGE');

Loop
fetch objectcursor into sqlstmt;
exit when objectcursor%NOTFOUND;
execute immediate sqlstmt;
End Loop;
Close objectcursor;
End;

But trying to port this directly to Postgres gets an error right at the beginning:

ERROR: syntax error at or near "varchar"
LINE 13: sqlstmt varchar(200);

So I’ll bring this whole thing into a function instead. At the end of my script, I will just drop the function. This has the advantage that I can check the syntax of my code without having to execute it each time.

When working in an unfamiliar language (and I’ve never really done PL/pgSQL before) I apply a simple approach. First, I get a “Hello, World” type program working. In this case, I used an article from ONLamp.com and got that to run. Next, I strip away everything that is context specific until I get a piece of code that is minimal but syntactically correct. It looks like this:

CREATE OR REPLACE FUNCTION dbemo_nuke_postgres ()
RETURNS integer AS $$

declare

BEGIN
return 1;

End;

$$ LANGUAGE plpgsql;

Note that I modified the prefix so that it won’t hit the match for EMO% in the LIKE clause I am going to need later. I’m not sure what would happen if I dropped my function while it was running. Most likely it would work fine, but why chance it?
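
A minimal sketch of how the end of the script might look, assuming the function is called once and then removed. DROP FUNCTION takes the argument list (empty here) because PostgreSQL allows several functions to share a name:

```sql
-- Run the nuke function once; the stub version simply returns 1.
SELECT dbemo_nuke_postgres();

-- Clean up afterwards so the helper itself is not left behind.
-- The empty parentheses are the function's (empty) argument list.
DROP FUNCTION dbemo_nuke_postgres();
```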

Now I start adding in parts of the original function piece by piece. The first step is to put in the declarations. The Postgres porting guide says I don’t need a cursor. I quickly learn that the declare section goes between the function declaration and the BEGIN statement:

CREATE OR REPLACE FUNCTION dbemo_nuke_postgres ()
RETURNS integer AS $$

declare
sqlstmt varchar(200);
varowner varchar(50);
objectcursor RECORD;

BEGIN
return 1;

End;

$$ LANGUAGE plpgsql;

This is saved in the file EMO_nuke_postgres_part.sql. I run this from the command prompt:

adyoung@adyoung-laptop$ psql -f EMO_nuke_postgres_part.sql
CREATE FUNCTION
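
The “no cursor” advice presumably refers to PL/pgSQL’s FOR ... IN SELECT loop, which walks a query result one row at a time through a RECORD variable. Here is a rough sketch of that shape, assuming a hypothetical function name (dbemo_loop_sketch) and a stand-in query that yields two harmless statements. It only reports what it would run and executes nothing:

```sql
CREATE OR REPLACE FUNCTION dbemo_loop_sketch ()
RETURNS integer AS $$
DECLARE
    objectcursor RECORD;    -- holds one row of the query per iteration
    seen integer := 0;      -- number of rows visited
BEGIN
    -- Stand-in query: two constant strings instead of real DROP statements.
    FOR objectcursor IN
        SELECT 'SELECT 1'::text AS sqlstmt
        UNION ALL
        SELECT 'SELECT 2'::text
    LOOP
        RAISE NOTICE 'would execute: %', objectcursor.sqlstmt;
        -- EXECUTE objectcursor.sqlstmt;   -- the real script would run it here
        seen := seen + 1;
    END LOOP;
    RETURN seen;
END;
$$ LANGUAGE plpgsql;
```

The loop replaces the open, fetch, exit-when-not-found, and close steps of the Oracle version in one construct.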

And we’re off. Now to the Oracle specifics.

select USER into varowner from global_name;

This command seems to be asking for info regarding the current user.
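
PostgreSQL exposes the same information through the built-in current_user (the keyword user is a synonym), and no FROM clause is needed. A small sketch of the likely replacement; the PL/pgSQL form is shown as a comment because it only works inside a function body where varowner is declared:

```sql
-- Interactive check: which role is this session running as?
SELECT current_user;

-- Inside the PL/pgSQL function, the Oracle line would become:
--   SELECT current_user INTO varowner;
```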

select 'drop ' || object_type || ' ' || owner || '.' || object_name
|| decode(object_type, 'TABLE', ' cascade constraints')
from all_objects
where owner = varowner
and (object_name like 'EMO%' or object_name like '%STAT%PROC')
and object_type in ('TABLE', 'VIEW', 'SEQUENCE', 'SYNONYM',
'FUNCTION', 'PROCEDURE', 'PACKAGE')

This code dynamically creates a drop statement for each of the various objects in the schema. Where is this information stored in PostgreSQL? Let’s take a look:

adyoung@adyoung-laptop$ psql
Welcome to psql 8.2.6, the PostgreSQL interactive terminal.

Type: \copyright for distribution terms
\h for help with SQL commands
\? for help with psql commands
\g or terminate with semicolon to execute query
\q to quit

adyoung=# SELECT * from pg_
pg_aggregate pg_group
pg_am pg_index
pg_amop pg_indexes
pg_amproc pg_inherits
pg_attrdef pg_language
pg_attribute pg_largeobject
pg_authid pg_listener
pg_auth_members pg_locks
pg_autovacuum pg_namespace
pg_cast pg_opclass
pg_catalog. pg_operator
pg_class pg_pltemplate
pg_constraint pg_prepared_statements
pg_conversion pg_prepared_xacts
pg_cursors pg_proc
pg_database pg_rewrite
pg_depend pg_roles
pg_description pg_rules
pg_settings pg_stats
pg_shadow pg_stat_sys_indexes
pg_shdepend pg_stat_sys_tables
pg_shdescription pg_stat_user_indexes
pg_stat_activity pg_stat_user_tables
pg_stat_all_indexes pg_tables
pg_stat_all_tables pg_tablespace
pg_stat_database pg_temp_1.
pg_statio_all_indexes pg_timezone_abbrevs
pg_statio_all_sequences pg_timezone_names
pg_statio_all_tables pg_toast.
pg_statio_sys_indexes pg_trigger
pg_statio_sys_sequences pg_type
pg_statio_sys_tables pg_user
pg_statio_user_indexes pg_views
pg_statio_user_sequences
pg_statio_user_tables
pg_statistic

I know that metadata is stored in the pg_.* family of objects, so I use statement completion to show me what is there. Nothing is obvious to me yet, so I poke around at pg_tables.

adyoung=# \d pg_tables;
View "pg_catalog.pg_tables"
Column | Type | Modifiers
-------------+---------+-----------
schemaname | name |
tablename | name |
tableowner | name |
tablespace | name |
hasindexes | boolean |
hasrules | boolean |
hastriggers | boolean |
View definition:
SELECT n.nspname AS schemaname, c.relname AS tablename,
pg_get_userbyid(c.relowner) AS tableowner, t.spcname AS "tablespace",
c.relhasindex AS hasindexes, c.relhasrules AS hasrules,
c.reltriggers > 0 AS hastriggers
FROM pg_class c
LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
LEFT JOIN pg_tablespace t ON t.oid = c.reltablespace
WHERE c.relkind = 'r'::"char";

Aha! It is a view onto a relation called pg_class that seems to hold all of the metadata. Let’s take a look at that:

adyoung=# \d pg_class
Table "pg_catalog.pg_class"
Column | Type | Modifiers
----------------+-----------+-----------
relname | name | not null
relnamespace | oid | not null
reltype | oid | not null
relowner | oid | not null
relam | oid | not null
relfilenode | oid | not null
reltablespace | oid | not null
relpages | integer | not null
reltuples | real | not null
reltoastrelid | oid | not null
reltoastidxid | oid | not null
relhasindex | boolean | not null
relisshared | boolean | not null
relkind | "char" | not null
relnatts | smallint | not null
relchecks | smallint | not null
reltriggers | smallint | not null
relukeys | smallint | not null
relfkeys | smallint | not null
relrefs | smallint | not null
relhasoids | boolean | not null
relhaspkey | boolean | not null
relhasrules | boolean | not null
relhassubclass | boolean | not null
relfrozenxid | xid | not null
relacl | aclitem[] |
reloptions | text[] |
Indexes:
"pg_class_oid_index" UNIQUE, btree (oid)
"pg_class_relname_nsp_index" UNIQUE, btree (relname, relnamespace)

Lots of good stuff here. OK, we now have enough raw information to generate some real knowledge. In Postgres, everything relation-like (tables, indexes, sequences, views) is a class, with one row each in pg_class. To find out whether something is a table, you select only the rows with the right relkind:

adyoung=# SELECT distinct (relkind) from pg_class ;
relkind
---------
S
i
r
t
v

Funny that tables aren’t kind ‘t’; that letter belongs to TOAST tables, while ordinary tables are ‘r’, as the pg_tables view definition showed:

adyoung=# SELECT relname from pg_class where relkind ='t' limit (3);
relname
----------------
pg_toast_10757
pg_toast_10762
pg_toast_10767
(3 rows)
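
The one-letter codes are easier to read with a legend. The sketch below counts objects per kind and labels the codes seen above, plus ‘c’ for composite types. The labels follow the pg_class catalog documentation; newer PostgreSQL releases define further kinds, which fall through to the ELSE branch here:

```sql
SELECT relkind,
       CASE relkind
            WHEN 'r' THEN 'ordinary table'
            WHEN 'i' THEN 'index'
            WHEN 'S' THEN 'sequence'
            WHEN 'v' THEN 'view'
            WHEN 't' THEN 'TOAST table'
            WHEN 'c' THEN 'composite type'
            ELSE 'other'
       END AS meaning,
       count(*) AS objects
FROM pg_class
GROUP BY relkind
ORDER BY relkind;
```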

OK, our original SQL builds a statement for each object and then executes it to drop that object. Which rows is it selecting? Here’s our where clause:

where owner = varowner
and (object_name like 'EMO%' or object_name like '%STAT%PROC')
and object_type in ('TABLE', 'VIEW', 'SEQUENCE', 'SYNONYM',
'FUNCTION', 'PROCEDURE', 'PACKAGE')

Let’s start with the like clause.

adyoung=# SELECT count(relname) from pg_class where relname like 'emo%' or relname like '%STAT%PROC';
count
-------
228
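
With the name filter working, a PostgreSQL counterpart of the Oracle SELECT can be sketched against pg_class. This is an illustration under several assumptions: it covers only tables, views, and sequences (relkind ‘r’, ‘v’, ‘S’), it uses lowercase patterns because PostgreSQL folds unquoted identifiers to lowercase, and it appends CASCADE as a rough stand-in for Oracle’s cascade constraints. It only builds the statement text; nothing is dropped:

```sql
SELECT 'DROP '
       || CASE c.relkind
               WHEN 'r' THEN 'TABLE'
               WHEN 'v' THEN 'VIEW'
               WHEN 'S' THEN 'SEQUENCE'
          END
       || ' ' || quote_ident(n.nspname) || '.' || quote_ident(c.relname)
       || ' CASCADE' AS drop_statement
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace      -- schema of the object
WHERE c.relkind IN ('r', 'v', 'S')
  AND pg_get_userbyid(c.relowner) = current_user   -- only objects I own
  AND (c.relname LIKE 'emo%' OR c.relname LIKE '%stat%proc');
```

quote_ident() keeps mixed-case or otherwise unusual names intact in the generated statement.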

Looks good. But S, i, r, t, and v don’t cover the range of object types we want. I don’t see any functions. Let me check pg_proc:

adyoung=# \d pg_proc
Table "pg_catalog.pg_proc"
Column | Type | Modifiers
----------------+-----------+-----------
proname | name | not null
pronamespace | oid | not null
proowner | oid | not null
prolang | oid | not null
proisagg | boolean | not null
prosecdef | boolean | not null
proisstrict | boolean | not null
proretset | boolean | not null
provolatile | "char" | not null
pronargs | smallint | not null
prorettype | oid | not null
proargtypes | oidvector | not null
proallargtypes | oid[] |
proargmodes | "char"[] |
proargnames | text[] |
prosrc | text |
probin | bytea |
proacl | aclitem[] |
Indexes:
"pg_proc_oid_index" UNIQUE, btree (oid)
"pg_proc_proname_args_nsp_index" UNIQUE, btree (proname, proargtypes, pronamespace)

Not a view. OK:

adyoung=# SELECT proname from pg_proc where proname like 'dbemo_nuke_postgres' ;
proname
---------------------
dbemo_nuke_postgres
dbemo_nuke_postgres
(2 rows)

So we know that my function is in there.
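
The two rows are not necessarily duplicates: the unique index on pg_proc covers (proname, proargtypes, pronamespace), so entries with the same name must differ in argument types or in schema. The sketch below shows both distinguishing columns and builds a matching DROP FUNCTION statement. It assumes the built-in oidvectortypes() function to render the argument type list as text, and it uses the function name from this post:

```sql
SELECT n.nspname AS schema_name,
       p.proname AS function_name,
       oidvectortypes(p.proargtypes) AS argument_types,
       'DROP FUNCTION ' || quote_ident(n.nspname) || '.' || quote_ident(p.proname)
           || '(' || oidvectortypes(p.proargtypes) || ')' AS drop_statement
FROM pg_proc p
JOIN pg_namespace n ON n.oid = p.pronamespace      -- schema of the function
WHERE p.proname = 'dbemo_nuke_postgres';
```

As with the pg_class query, this only generates text; the statements still have to be executed separately.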

Adam Young's Web Log
