Monday, April 24, 2006

Daily error Oracle Server 10g(rdbms) + OHS + HTML DB on Linux x86

OHS - Oracle HTTP Server (basically Oracle Managed Apache Httpd)
OPMN - Oracle Process Manager and Notification Server
ONS - Oracle Notification Services

Well, one day I discovered that I had run out of space on my server.
"That's impossible," I said to myself. Of course it was, but impossible things tend to happen and well, as I discovered, OHS had produced 1.5GB logs every week until now. And the error came to my attention only because I ran out of disk space. (The latter is probably my own problem as I don't have the useful habit of reading all my server logs daily. I'd perfer something bit more imaginative.)

This case is noteworthy as it is a standard by-the-book installation onto SLES9 with 10gR2 rdbms, OHS and HTML DB upgraded to 2.0. So this is the default installation that gives us such a situation.

The log file being filled was $ORACLE_HTTP_HOME/opmn/logs/ons.log

-rw-r--r-- 1 oracle oinstall 786M 2006-04-24 17:19 ons.log
-rw-r--r-- 1 oracle oinstall 1.5G 2006-04-08 04:00 ons.log.06-04-08_04:00:11
-rw-r--r-- 1 oracle oinstall 1.5G 2006-04-14 15:28 ons.log.06-04-14_15:28:31
-rw-r--r-- 1 oracle oinstall 1.5G 2006-04-21 02:59 ons.log.06-04-21_02:59:47


The message being written hundreds and thousands of times over and over was
06/04/24 17:42:02 [4] Local connection 0,127.0.0.1,6113 missing form factor


And "netstat -na" gives me a throng of queries waiting for the better world...
tcp 0 0 127.0.0.1:6113 127.0.0.1:50201 TIME_WAIT

x times 100 or perhaps 1000. Didn't count, but it is apparent they seem to be origined from port 6113, which is defined as the ONS service under OHS.



So what's the matter?

I have Oracle Server and OHS with HTML DB installed and it appears that Oracle Notification Services is trying (or one of it's clients?) very hard to connect to localhost, but failes because of the listener conflict.

When OHS and databse are installed on the same box, the installer will mistakenly configure identical ONS ports in both homes which creates an operational conflict when both the HTTP ONS and database listener services are running.


Solution

I unsubscribed from ONS with my database listener. To do that, I added the following row to my database listener.ora:
SUBSCRIBE_FOR_NODE_DOWN_EVENT_=OFF


Now "netstat -na" is much more reasonable, but I still cannot start opmnctl.


So I went ahead and changed notification server ports by +1
$ORACLE_HTTP_HOME/opmn/conf/opmn.xml:



And then "./$ORACLE_HTTP_HOME/opmn/bin/opmnctl startall"
which responded "opmnctl: starting opmn and all managed processes..."

The ons.log said now only "06/04/25 10:23:38 [4] ONS server initiated".

So my OHS and access to HTML DB was up and running once again.

PS
When my database couldn't write anymore to the hard disk, I couldn't even log in with "sqlplus / as sysdba" because that connection needs to write an audit file. Having 0B of free space makes that somewhat troublesome. So I went ahead and deleted one of the massive 1.5GB logfiles and started the database once again as it had shut down because it couldn't expand one tablespace nomore.
I was very relieved to see that it didn't take any recovery as I haven't backed up anything (it's a development database). Saved me a lot of trouble and well.. it showed me that even the smallest dev-base should have a minimal backup at hand. So I'm off to backup some data without delay.

Ignite my anger with your delay
and punishments will come your way..
-- Ayreon "Into the Electric Castle"

Tuesday, February 21, 2006

Hint for desktop-Oracle developers - Insert into table with NOLOGGING

Here's the trick.
I've got a table with 2,2M rows for performance testing purposes, but now I wanted to export the data to a new, partitioned table. So logical would be just to INSERT INTO, right?

Well in my case it wasn't, because the redo that would have been created as a result, would have been too much for my PC (maybe it wouldn't have been, but I didn't want to waste my time). So I searched a bit because I remembered a way to make a "direct-path insert" with no redo log.

The answer was, as always, in the fine documentation of Oracle. LOGGING is the default, as in production you surely wouldn't want to lose any data, but in development there is no need for that kind of strickness.

SQL> alter table partitionedTable nologging;

Table altered.

SQL> insert into partitionedTable select * from noPartitionsWhatsoever nologging;

2198152 rows created.

SQL> alter table partitionedTable logging;

Table altered.

Monday, February 20, 2006

Daily Errorcode

From today on, I'm starting a new tradition. Every time I get an error, I'll post it here and also try to include the solution. Today's error was the "double-Duh type". Read on and you'll get it.


I wanted to do a little backup of my Win desktop's 10gR2 and it gave me the ORA-19809.
Searched a bit and Paul M. answered in the OTN forums:

Flashback Recovery and RMAN are independent from each other, but if you activate Flashback Recovery you'll need more space in Flash Recovery area, because of Flashback logs.

To avoid ORA-19809 you have to periodically free space in Flash Recovery area, and normally this is done by using RMAN :

$ oerr ora 19809
19809, 00000, "limit exceeded for recovery files"
//*Cause: The limit for recovery files specified by the
// DB_RECOVERY_FILE_DEST_SIZE was exceeded.
//*Action:The error is accompanied by 19804. See message 19804 for further
// details.
$ oerr ora 19804
19804, 00000, "cannot reclaim %s bytes disk space from %s limit"
// *Cause: Oracle cannot reclaim disk space of specified bytes from the
// DB_RECOVERY_FILE_DEST_SIZE limit.
// *Action: There are five possible solutions:
// 1) Take frequent backup of recovery area using RMAN.
// 2) Consider changing RMAN retention policy.
// 3) Consider changing RMAN archivelog deletion policy.
// 4) Add disk space and increase DB_RECOVERY_FILE_DEST_SIZE.
// 5) Delete files from recovery area using RMAN.
So now I changed my DB_RECOVERY_FILE_DEST_SIZE from 2G to 4G just to be sure..
Cause I had the space and had no wish to change the retention policy. Tell you the truth, I don't even know what is the default retention policy for Enterprise Edition.

> shutdown

At this point I slapped my forehead and went Duh! - forgot the "immediate".
So I went for a walk around the building..

I had a revelation during the walk - Windows has the Service Manager where I can shut down services explicitly, so I did it. And voila - the server was successfully restarted as it came up automatically.

So now, with 4G of backup space, I started making a full backup - with Enterprise Manager again. Incremental level 0 means it is the basis for level 1 increments - level 0 being the first full backup and Level 1 being the "delta data" from the last Level 0.

RMAN> backup incremental level 0 cumulative device type disk tag '%TAG' database include current controlfile;


Finally if all was successfully completed, I remembered that my db was about 2,3GB so there was no possible way to fit the backup into 2G-s.
Time to slap my forehead again?

Friday, December 30, 2005

Oracle developers switch to Solaris 10 as preferred development environment

So this news from SDN (Sun Developer Network) has come to my attention.
Oracle internal development switches to Solaris 10 (OpenSolaris is released in many milestones, two of them have already passed, but alot of work is also ahead).
I guess these guys really have no difference if to develop on Solaris, Windows, Fedora, SuSE, MacOS X4 tiger or Ubuntu as JDeveloper, Oracle Database and various other tools are readily available on those platforms. I find this kind of development intriguing as it probably is a political decision on behalf of Oracle.

Sunday, November 20, 2005

SS2K/SS2K5 vs Oracle part 1

This post will be in estonian as I wrote it in estonian and although I thought that this blog might be a english one for me. Still, I think when I have the energy, I'll translate it into english. For the record, it's about the MVCC and the locking.


Jutustan teile lukkudest ja versioonimisest.
Et saada aru SSi ja Oraakli lukustuspoliitikast, teeme ühe näite selle kohta läbi.

Olgu meil tabelid inimestega ja nende ostudega. Sisestagem alustuseks sinna mõned väärtused.
Märkus: panin nimele VARCHARi, mitte klassikalise VARCHAR2, et oleks MSSQLis vähem editida, kuid hiljem selgus, et seal pole NUMBER tüüpi ja "REFERENCES inimene" tuleb pisut pikemalt lahti kirjutada.. :P Aga see vahememmo.

martti@oracle1> create table inimene
2 (inimene_id int primary key,
3 nimi varchar(50),
4 saldo number
5 );


Table created.

martti@oracle1> create table ost
2 (ostu_id int primary key,
3 inimene_id references inimene,
4 hind number
5 );

Table created.

martti@oracle1> insert into inimene (inimene_id,nimi) values (1,'Martti');

1 row created.

martti@oracle1> insert into inimene (inimene_id,nimi) values (2,'Pärtel');

1 row created.

martti@oracle1> insert into ost (ostu_id, inimene_id, hind) values (100,1,11000);

1 row created.

martti@oracle1> insert into ost (ostu_id, inimene_id, hind) values (101,1,-27000);

1 row created.

-- Nüüd uuendagem isiku 1 saldoseisu.

martti@oracle1> update inimene set saldo =
2 (select sum(hind) from ost where ost.inimene_id=inimene.inimene_id)
3 where inimene.inimene_id=1;

1 row updated.

-- Olgu meil algseis selline:

martti@oracle1> select * from ost;

OSTU_ID INIMENE_ID HIND
---------- ---------- ----------
100 1 11000
101 1 -27000


martti@oracle1> select * from inimene;

INIMENE_ID NIMI SALDO
---------- -------------------------------------------------- ----------
1 Martti -16000
2 Pärtel


martti@oracle1> commit;

Commit complete.


-- Ok. Seni oli kõik ilus ja lihtne.
-- Võtame nüüd kaks konkureerivat sessiooni.
-- Selguse mõttes on üks oracle1 ja teine oracle2.

-- Pärtel sai millegipärast 19k.
oracle1>insert into ost(ostu_id, inimene_id, hind)
2 values (102,2,19000);

1 row created.

-- Teises sessioonis tuleb ntx teine teller ja vaatab, et oli
-- esimese ülekande valele inimesele omistanud..

oracle2>update ost set inimene_id = 2 where ostu_id = 100;

1 row updated.

-- Teise sessiooni tegelane mõtleb, et saab veel tehtud jama kiirelt
-- heaks teha ja uuendab mõjutatud inimeste saldod üle.

oracle2>update inimene set saldo =
2 (select sum(ost.hind) from ost
3 where ost.inimene_id = inimene.inimene_id)
4 where inimene.inimene_id in (1,2);

2 rows updated.

-- Esimeses sessioonis tahab teller Pärtli tehingu tõttu samuti saldot
-- uuendada, kuid ta ei saa - jääb ootele, kuna teine sessioon on
-- enda poolt uuendatud väljad lukustanud kuni transaktsiooni lõpuni.

oracle1>update inimene set saldo =
2 (select sum(ost.hind) from ost
3 where ost.inimene_id = inimene.inimene_id)
4 where inimene.inimene_id = 2;


-- Teine teller lõpetab transaktsiooni.

oracle2>commit;

Commit complete.

-- Esimene sessioon ütleb meile koheselt luku kadudes nüüd:

oracle1> 1 row updated.

-- Lõpetame ka esimese sessiooni transaktsiooni.

oracle1>commit;

Commit complete.


-- Vaatame tulemusi:

oracle1>select * from ost;

OSTU_ID INIMENE_ID HIND
---------- ---------- ----------
100 2 11000
101 1 -27000
102 2 19000

oracle1>select * from inimene;

INIMENE_ID NIMI SALDO
---------- -------------------------------------------------- ----------
1 Martti -27000
2 Pärtel 19000

-- Mis siis nüüd toimus?
-- Nimelt esimene sessioon kandis Pärtlile 19k üle.
-- Teine sessioon kandis üle 11k Martti->Pärtel.
-- Nüüd mõlemad sessioonid uuendasid hoolega saldosid, kuid lõppeks jäi vale seis,
-- kuna viimasena uuendas esimene sessioon, mis ei näinud teise sessiooni uuendust.
-- See demonstreerib tähtsat käitumuslikku aspekti: mõlemad sessioonid tahavad uuendada
-- samu ridu ja loevad tulemuse mällu. Üks jääb lihtsalt ootama, millal teise transaktsioon
-- on läbi ja uuendab siis mälus olevate väärtustega endale vajaliku rea.

SS2k-s sama asi.
Kasutan lihtsuse mõttes osql-i.

osql -S MARTTI\SS2K -d pubs -E

create table inimene
(inimene_id int primary key,
nimi varchar(50),
saldo bigint
);
go

create table ost
(ostu_id int primary key,
inimene_id int
foreign key references inimene(inimene_id)
ON DELETE NO ACTION,
hind bigint
);
go

insert into inimene (inimene_id,nimi) values (1,'Martti');

insert into inimene (inimene_id,nimi) values (2,'Pärtel');

insert into ost (ostu_id, inimene_id, hind) values (100,1,11000);

insert into ost (ostu_id, inimene_id, hind) values (101,1,-27000);

-- Nüüd uuendagem isiku 1 saldoseisu.

update inimene set saldo =
(select sum(hind) from ost where ost.inimene_id=inimene.inimene_id)
where inimene.inimene_id=1;


-- Olgu meil algseis selline:

1> select * from ost;
2> go
ostu_id inimene_id hind
----------- ----------- --------------------
100 1 11000
101 1 -27000

(2 rows affected)

1> select * from inimene;
2> go
inimene_id nimi saldo
----------- ------------------------- ----------------
1 Martti -16000
2 Pärtel NULL

(2 rows affected)


-- Ok. Seni oli kõik ilus ja lihtne.
-- Võtame nüüd kaks konkureerivat sessiooni.
-- Selguse mõttes on üks ss2k1 ja teine ss2k2

-- Kuna SS2K-s commititakse automaatselt pärast iga DML-i,
-- tuleb transaktsiooni lahti hoidmiseks kasutada abi.
ss2k1>BEGIN TRANSACTION
ss2k1>go

ss2k2>BEGIN TRANSACTION
ss2k2>go

-- Pärtel sai millegipärast 19k.
ss2k1>insert into ost(ostu_id, inimene_id, hind)
values (102,2,19000);

(1 row affected)

-- Teises sessioonis tuleb ntx teine teller ja vaatab, et oli
-- esimese ülekande valele inimesele omistanud..

ss2k2>update ost set inimene_id = 2 where ostu_id = 100;

(1 row affected)

-- Teise sessiooni tegelane mõtleb, et saab veel tehtud jama kiirelt
-- heaks teha ja uuendab mõjutatud inimeste saldod üle.

ss2k2>update inimene set saldo =
(select sum(ost.hind) from ost
where ost.inimene_id = inimene.inimene_id)
where inimene.inimene_id in (1,2);

-- Nüüd SS2K-s jääb asi ootele, kuna esimene sessioon parajasti
-- uuendas ühte kirjet.


-- Esimeses sessioonis tahab teller Pärtli tehingu tõttu samuti saldot
-- uuendada, kuid ta ei saa - jääb ootele, kuna teine sessioon on
-- enda poolt uuendatud väljad lukustanud kuni transaktsiooni lõpuni.

ss2k1>update inimene set saldo =
(select sum(ost.hind) from ost
where ost.inimene_id = inimene.inimene_id)
where inimene.inimene_id = 2;

-- Nüüd mis juhtub. Mõlemad tahavad uuendada saldosid isikul 2.
-- Samas teise sessiooni uuendus oli juba lukus. Ja nüüd on seda
-- ka esimese uuendus. Sellist olukorda kutsutakse surmalukuks
-- ehk siis (välja-)maakeeli deadlock.
-- Kuidas käitub nüüd SQL Server? Ta valib randomiga ühe transaktsiooni
-- välja ja rullib kogu asja tagasi. Kõik muudatused kaovad.
-- Väljastatakse järgmine teade tagasirullumise kohta:

Msg 1205, Level 13, State 50, Server MARTTI\SS2K, Line 1
Transaction (Process ID 52) was deadlocked on {lock} resources with another
process and has been chosen as the deadlock victim. Rerun the transaction.

-- Lõpetame transaktsiooni ja mõlemad sessioonid näevad üht-sama
ss2k1> commit transaction
go

1> select * from ost;
2> go
ostu_id inimene_id hind
----------- ----------- --------------------
100 1 11000
101 1 -27000
102 2 19000

1> select * from inimene;
2> go
inimene_id nimi saldo
----------- ------------------- -----------
1 Martti -16000
2 Pärtel 19000


-- Seega võib öelda, et kuna Pärtel kasutaks SS2K-d, sain ma
-- puhta muidu 11k tema kulul :). (Omaette teema on muidugi, et kuidas
-- üldse oleks taoline kala tekkida, kuid näite mõttes kõlbab.)

Näide ei tulnud võibolla sellepärast kõige parem, et pidin justkui "sunniviisiliselt"
kasutama transaktsiooni loomiseks SS2K-s abi, kuid Oracles just nii transaktsioonid
töötavadki (võiks öelda, et transaktsioonid üldiseltki, ANSI järgi).
SS2K-s on autocommit peal seetõttu, et vastasel korral jääksidki asjad lukku ja
deadlockid oleks tavaline.
Püüdsin olukorda sarnaselt simuleerida.
Samuti on tihti vaja ka SS2K-s luua sarnaselt transaktsioone, sest mõned
muudatused peavad käima järjest. Seetõttu kui käsitsi taoliselt paar muudatust teha,
ei tule transaktsiooni kasutamise vajadus just väga teravalt esile.
(Oracles oleks pidanud SS2K autocommiti simuleerima sel juhul lihtsalt
iga DML-i taha "commit;" pannes, kuid atomaarse transaktsiooni
nõuete korral poleks see võimalik olnud.)

Kui oleks jätnud sarnaselt defaultina SS2K-s oleva autocommiti peale,
oleks siin aga tõusnud küsimuseks järgmine probleem, milleks on
erinevate uuenduste (UPDATE), täienduste (INSERT), kustutamiste (DELETE) ja
pärimiste (SELECT) tõttu pandavad lukud. Kusjuures oluline on just, et
SS2K paneb luku isegi SELECTi tehes.
Reaalses elus ei tehta kalli raha eest süsteeme ühele kasutajatele, reeglina..
Mõtle nüüd kui kümme/sada inimest tahavad tabelist korraga andmeid - tabel on lukus ja
kordamööda saadakse andmeid. Mõtle nüüd, kui keegi tahaks ühte veergu samal ajal
pisut uuendada - ta ei saaks seda teha, kuna tabel on tühipalja SELECT * -pärast lukus.
Korruta seda veel 10, 100, 1000.. jne kordadega ja sa saad täiesti hullumeelselt
eskaleeruvad rea- ja tabeli-taseme lukud.

SS2K-st backupi tegemine on selle pärast arvatavasti väga piinarikas (arvan).

Üks võimalus on ka lubada räpast lugemist ehk dirty read-i, mis sisuliselt tähendab,
et ei hoolita lukkudest ja loetakse kasvõi tabelit, mille uuendus on poole peal.
(Huvitav, kas see tähendab, et ntx poolte inimeste saldo on kalkuleeritud ja
teistel on pooleli? Arvan, et just seda see tähendakski.)
Igaljuhul räpane värk sellisel juhul.


Lühidalt: SS2K-s on lukud ja versioonimine implementeeritud täiesti erinevalt Oracle-st
ja see tekitab tihtipeale tõsiselt peavalusid. Räpane lugemine vs eskaleeruvad lukud. Pole
just parim valik. Näen võimalikke probleeme backuppimisel (mis on lugemine, mis samuti lukustab)
ja segamini uuendamisel, kus deadlocki tekkides ainult üks transaktsioon läbi läheb.
Ülejäänud peavad "uuesti üritama". Samas, kui tegu on protseduuriga, mis ühe korra uuendabki,
siis ei tule keegi "uuesti üritama" ja need muudatused ongi kadunud baasi jaoks.

Microsofti asjadega on väga lihtne lihtsaid asju teha (loe: õppekursusel näidatud
lihtsaid näiteid järgida), kuid reaalne elu on tihtipeale pisut teine.

Aga võibolla siin on tegu maailmavaatelise erinevusega?
Võibolla mõnedele ei olegi oluline kvaliteet, vaid muud asjad. Ühed nauduvad kvaliteetkultuuri,
kuid suuremjagu lepivad raadio-/telemüraga, kuna nii on lihtsam ja kiirem.


PS
Kui olin eelneva osa valmis kirjutanud, siis kõrvalmasinas installisin parajasti
SQL Server 2005-te, et proovida sama asja sellel. Kui lõpuks mahuka installiga maha sain,
siis läksin lootustandvalt "Server Management Studiosse" ja tegin sama asja läbi.
Ei midagi uut siin ilmas. Teade oli vist natike teistsugune. Polnud servu nime küljes.

Msg 1205, Level 13, State 51, Line 1
Transaction (Process ID 51) was deadlocked on lock resources with another process and
has been chosen as the deadlock victim. Rerun the transaction.


PPS
Mind ootas esialgu installimisel ees vägagi kole üllatus - nimelt nõuab isegi SS2K5
custom install, et vindil nimega C: oleks ~1.6GB vaba ruumi, kuigi ma kõik võimalikud
komponendid olin suunanud teisele partitsioonile. (Mistõttu ühele masinale mul ei õnnestunudki SS2K5 peale installida - ma ei saa nii palju vaba ruumi ühe näpuklõpsuga peapartitsioonil.)
Tegu on evaluation copyga, mistõttu sügavalt loodan, et selline jube viga on
tavariliisis ära parandet. Kus seda enne on nähtud, et ma ei saa valida programmi
installikataloogigi - custom mode'is?

Thursday, August 18, 2005

Realplayer download

Found a good RealPlayer download site by accident. You won't be bothered with the usual bugging registrations and ads.

http://www.real.com/freeplayer/?rppr=oracle

Oracle documentation

If you move around alot or just don't have the Oracle documentation in your bookmarks, you might get a bit frustrated when going to www.oracle.com and trying to get to the Oracle 9iR2 documentation etc. But I found a nice little page right in the top level domain, where you can navigate to your favourite docs with no problem and probably with only three clicks - and the third one is already in the "Contents" section of your preferred help document.

docs.oracle.com

Other one is a search from all the major products documents:

http://tahiti.oracle.com/

Wednesday, August 10, 2005

ORA-12638 Credential retrieval failed

Trying to install Ora9iR2 and the server installation was already completed - the Installer was already at the Configurations and the weirdest thing happened - error in the title came from nowhere.

In short, I yelled, I googled and here it is - you have to go to your ORA_HOME\network\admin\sqlnet.ora and change the following line

SQLNET.AUTHENTICATION_SERVICES= (NTS)

into this

SQLNET.AUTHENTICATION_SERVICES=(NONE)

It disables the OS-based authentication, I think. That's why it failed. I didn't have oracle user named the same as my OS user..


Or maybe I'm completly on the wrong tracks - but it works, so I don't care as much anymore. (Well, I'd like to care, but there really isn't that much time in my hands, so I'll postpone the thorough investigation to the next time this happens :) )

How to manually remove Oracle 9i in WIndows

  • Uninstall all Oracle components using the Oracle Universal Installer (OUI).
  • Run regedit.exe and delete the HKEY_LOCAL_MACHINE/SOFTWARE/ORACLE key. This contains registry entires for all Oracle products.
  • Delete any references to Oracle services left behind in the following part of the registry:
    HKEY_LOCAL_MACHINE/SYSTEM/CurrentControlSet/Services/Ora*
    It should be pretty obvious which ones relate to Oracle.
  • Reboot your machine.
  • Delete the C:\Oracle directory, or whatever directory is your ORACLE_BASE.
  • Delete the C:\Program Files\Oracle directory.
  • Empty the contents of your c:\temp directory.
  • Empty your recycle bin.
At this point your machine will be as clean of Oracle components as it can be without a complete OS reinstall.

Remember, manually editing your registry can be very destructive and force an OS reinstall so only do it as a last resort.