Nginx Book: Chapter 2


NGINX Configuration Basics

NGINX is configured through its configuration file, ‘nginx.conf’. The default nginx.conf file (that comes with the installation) is configured to listen on port 80 and serves a default test page (as shown in figure 1.1 of chapter 1).

Now, instead of taking this default nginx.conf file as a starting point, we will take a bottom-up approach and explain various aspects of the nginx.conf file, such as:

  • where is it located?
  • what is its file format?
  • what are its building blocks?
  • how is it structured?
  • how do we configure a simple web server?
  • how do we verify our configuration changes?
  • how do we apply these changes to a running service?

So let’s start our journey!

2.1 – File Location

On most Linux distributions, the standard location of this file is


Other common locations of this file are:


You can also find out this path using the command:

$ nginx -V 2>&1 | grep --color -- --conf-path=

Running this command will highlight ‘--conf-path=’ so you can see the path.
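To see what this pipeline does without touching a live nginx install, here is an illustrative run against a sample line of ‘nginx -V’ output (the configure arguments shown are an assumption, not your actual build):

```shell
# simulate one line of `nginx -V` output (values are illustrative)
# and extract the conf-path exactly as the pipeline above would
echo "configure arguments: --prefix=/etc/nginx --conf-path=/etc/nginx/nginx.conf" \
  | grep -o -- '--conf-path=[^ ]*'
```

On a real system, the path printed will be whatever your build was configured with.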

On Windows, if you have set up nginx under ‘c:\nginx’, then the file will be under the ‘conf’ folder, i.e.


2.2 – File Format

nginx.conf is a plain text file that can be edited in any text editor you are comfortable with. However, for some popular text editors, third-party plugins are available to help you write the configuration.

Visual Studio Code has a community plugin for writing the nginx.conf file. Starting with this may be useful as it provides contextual help.

2.3 – Building Blocks

NGINX’s functionality is based on modules. These modules expose their functionality in the form of commands known as directives.

Modules are classified as core and non-core. All the basic functionality is embedded in core modules.

Directives are the basic building blocks. They can be considered analogous to functions in a typical programming language. Similar to function parameters, we pass one or more input values to a directive, separated by spaces.

Directives are classified as simple and block directives.

Simple Directive

As the name suggests they are simple commands of the form

<directive> <value1> <value2>;

Each simple directive is defined on a single line that ends with a semicolon ‘;’. Most simple directives accept just a single value.

As an example consider how a simple directive is defined within nginx.conf file:

#set the number of cpu cores to use (one per worker process)
worker_processes 2;

Here you need to understand:

  • the # symbol is used to define comments
  • worker_processes is a directive that tells the master process how many worker processes need to be spawned
  • 2 is the parameter value that tells the master process to spawn two worker processes
  • a semicolon ‘;’ is required at the end of the directive

Refer to section 1.4 (chapter 1) if you need some explanation of the master and worker processes.

Block Directive

A block directive is like a container that can hold one or more simple and block directives. Its form is something like:

<block1> {

	<directive1> <value1>;
	<directive2> <value2>;
	<block2> {
		<directive3> <value3>;
	}
}

  • block1 is the block directive that contains two simple directives and a single block directive (block2)
  • directive1, directive2 and directive3 are simple directives
  • value1, value2 and value3 are the parameter values of the respective directives
  • block2 is another block directive which is contained within the parent block (block1)

As an example consider how a block directive is defined within nginx.conf file:

events {
	worker_connections 1024;
}

Here we need to understand:

  • events is the name of a block directive whose purpose is to define how the worker processes will perform connection processing
  • { } defines the scope of the block directive
  • within the { }, the simple directive worker_connections tells a worker process its limit of concurrent connections
  • 1024 is the value that tells a worker process to open a maximum of 1024 concurrent connections

2.4 – File Structure

There is an underlying structure in the nginx.conf file that needs to be understood well. This structure is based on ‘contexts’.

A context is an area (or region) within the nginx.conf file where certain directives (simple/block) become eligible. Using a directive outside its eligibility area results in an invalid configuration file.

In general, there are two types of context:

  1. Main Context
  2. Block Context

Main Context

This is the topmost context and is also referred to as the global or file context. A very limited set of directives can be defined at this level. Some commonly used simple directives at this level are:

  • user – Defines the OS user and group that will own the spawned worker processes. It accepts two parameters: the first is the OS user name and the second is the OS group name.
  • worker_processes – Defines the total number of worker processes that need to be spawned. It accepts only one parameter, a positive integer less than or equal to the number of cpu cores of the system. If you want the system to auto-assign the value based on the cpu cores, set the parameter value to ‘auto’.
  • pid – Defines the file name and path of the .pid file of the master process.

At the main context level, only the following four block directives are eligible:

  • events – Defines how the worker processes will perform connection processing.
  • http – Defines how http traffic will be managed.
  • mail – Defines how email traffic will be managed.
  • stream – Defines how tcp/udp traffic will be managed.
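Putting the list above together, a top-level nginx.conf skeleton could look like the following sketch (the user name is illustrative, the blocks are shown empty, and the mail and stream blocks require their respective modules to be built in):

```nginx
# main (file) context
user nginx nginx;
worker_processes auto;

events { }   # connection processing
http   { }   # web traffic
mail   { }   # mail proxy traffic
stream { }   # tcp/udp traffic
```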

Block Context

Each block directive sets its own context. This context is called block context. Since a block directive can contain another block directive, the containing block context is referred to as parent context while the contained block context is referred to as child context.

For example, consider http directive:

http {
	server {
	}
}

  • http is a block directive and its context is referred to as http context
  • server is another block directive within the existing http context and its context is referred to as server context
  • from server context, the http context is referred to as parent context
  • similarly from http context, the server context is referred to as child context

This parent-child relationship between block contexts results in the following rules:

  • Any eligible directive that is defined at the parent level but not at the child level is inherited by the child context.
  • If the directive is also defined at the child level, then within the child context its own value is used (instead of the value from the parent context).
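A minimal sketch of these two rules (the paths and ports here are illustrative):

```nginx
http {
    root /usr/share/nginx;    # defined at the parent (http) level

    server {
        listen 80;
        # no 'root' here: this server inherits /usr/share/nginx from http
    }

    server {
        listen 8080;
        root /var/www/other;  # child-level value overrides the inherited one
    }
}
```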

2.5 – Configure A Simple Web Server

Now it’s time to apply what we have learned so far. We’ll set up a simple web server that serves files from the file system.

#you can give the name of any valid operating system user
user nobody;

#here we only want a single worker process for processing http requests
#you may select as many as your system allows
worker_processes 1;

#the default file name and path where it will be placed.
#this file contains the process id of our master process
pid /run/;

#a block directive for configuring connection processing logic
events {
	#specifies how many concurrent connections a worker process can accept
	worker_connections 1024;
}

#a block directive for handling http traffic
http {

	#another block directive that is used to define a virtual host
	server {
		#the port on which this virtual host listens
		listen 80;
		#the path from where this server will serve files
		#also known as the document root
		#you can set this to any of your existing folders
		root /usr/share/nginx;
	}
}

So far we have created a configuration file from scratch. In the next two sections we’ll learn how to verify and apply changes.

2.6 – Verify Configuration Changes

To verify whether our file is syntactically correct or not, use the following command:

$ sudo service nginx configtest

If it results in

Testing nginx configuration … [OK]

then the nginx.conf file is in a valid state that you can apply. However, if it results in

Testing nginx configuration … [FAIL]

Then you need to review your changes in order to fix the root cause.

2.7 – Apply Configuration Changes

Assuming your nginx.conf file is in valid state, you can apply configuration changes to running service using the following command

$ sudo service nginx reload

With the above command, the master process is not affected; only the worker processes are gracefully recreated. This is quite a useful feature, as it gives you the ability to make configuration changes without disrupting your service.

In a currently running service, if you mistakenly run the above command while the nginx.conf file is in an invalid state, the reload will fail without disrupting the existing service.

2.8 – Review Questions

Q.1 – Using the command discussed in section 2.1, find out the path of the .pid file.

Q.2 – All simple directives accept just a single parameter value. [True/False]

Q.3 – All block directives require curly braces {} after their name. [True/False]

Q.4 – Which of the following block directives cannot be used in the main context:

  • http
  • server
  • events

Q.5 – If your system has 4 cpu cores, setting

	worker_processes auto;

will result in the spawning of how many worker processes?

Q.6 – Consider the following snippet:

	http {
		root /usr/share/nginx;
		server {
			listen 80;
		}
		server {
			listen 90;
			root /home/moiz;
		}
	}

What will be the document root (the path from where user requests will be served) for:

  1. http://localhost/
  2. http://localhost:90/

Q.7 – Take note of the process ids assigned to the nginx service on your system. Now reload (instead of restarting, as done in Q.6 of the previous chapter) the nginx service. Again take note of the assigned process ids. Did you get the same process ids?

2.9 – Answers to Review Questions

A.1 – In section 2.1 we looked for the path of the nginx.conf file with the search keyword ‘--conf-path=’. Now just replace the search keyword with ‘--pid-path=’:

	$ nginx -V 2>&1 | grep --color -- --pid-path=

In our case, the file is located at:


A.2 – False.
While most simple directives accept a single parameter value, this is not true for all. For example, the simple directive ‘user’ accepts two parameters.

A.3 – True.

A.4 – The ‘server’ block directive cannot be defined directly under the main context. It can only appear under the http context.

A.5 – 4 worker processes will be spawned

A.6 – http://localhost/: As no root path has been defined for this virtual server, it will inherit the root path defined in the parent context. So the document root of this virtual server will be /usr/share/nginx.

http://localhost:90/: As a root path has been defined for this virtual server, it will override any such setting defined in the parent context. So the document root of this virtual server will be /home/moiz.

A.7 – When we run the command,

	sudo service nginx reload

only worker processes are recreated. There is no change in master process id.


Nginx Book: Chapter 1


NGINX: Introduction

NGINX (pronounced Engine-X) is a lightweight, high-performance web server. What this means is that it can handle more than ten thousand concurrent requests (c10k) with only a 2.5 MB memory footprint.

In 2005, handling ten thousand concurrent requests was a challenging task for the web servers of that time. Igor Sysoev started work on NGINX as a way to solve this challenge.

The goal of this chapter is to introduce you to the NGINX technology. After reading this chapter you’ll be able to:

  • install nginx
  • start and stop nginx
  • explain what happens when nginx service is started

If you have a good understanding of the above topics, then you can move to the next chapter, which deals with the configuration aspects of the NGINX server.

1.1 – Install NGINX

In this chapter we will only cover installation from the pre-built packages of the underlying operating system. If you want the latest version from the official NGINX repository or want to compile from source, please consult the official documentation.

Debian/Ubuntu Linux

To install nginx on Debian or Ubuntu Linux, run

#update repository
$ sudo apt-get update

#install nginx
$ sudo apt-get install nginx

Once installed, you can verify it as

$ sudo nginx -v

Running this command will display the version of the installed nginx:

nginx version: nginx/1.10.3 (Ubuntu)

RHEL/CentOS Linux

To install NGINX on RHEL or CentOS Linux, run

#install EPEL repository
$ sudo yum install epel-release

#update repository
$ sudo yum update

#install nginx
$ sudo yum install nginx

#verify installation
$ sudo nginx -v

If your server is restricted from talking directly to the internet, then simply download the .rpm package from the RHEL repository and run the following command to install it:

$ sudo yum localinstall nginx-xx.rpm


Windows

As of March 2018, the Windows version is still not recommended for production use due to its limited performance and scalability. However, if your goal is just to learn and experiment, then it should suffice.

To install the Windows version, just download the .zip file from the official repository.1 Unzip the file to a folder and run the following command to verify it:

c:\nginx\nginx -v

1.2 – Start/Stop NGINX


Linux

On Linux, installing from pre-built packages will install nginx as a service, so the service command can be used to start/stop it.

#start nginx
$ sudo service nginx start

The default nginx.conf starts the service on port 80, so opening the following url in a web browser


will result in

Figure 1.1: Default Test Page (index.html)

Similarly to stop the service, run

#stop nginx
$ sudo service nginx stop


Windows

To start nginx on Windows, run

#start server
start nginx

Similarly to stop, run

#fast shutdown
nginx -s stop 

#graceful shutdown
nginx -s quit

Please note that from now onwards, we’ll limit our discussion to linux only.

1.3 – Command List

Start and stop are just the two basic commands we learned above. There are other useful commands that we must know before moving forward. To get a list of all supported commands, run

#list commands
$ sudo service nginx

This will result in

Usage: nginx {start|stop|restart|reload|force-reload|status|configtest|rotate|upgrade}

For better understanding, we have classified the above commands into three sets.

  1. Service Management Commands
  2. Configuration Management Commands
  3. Miscellaneous Commands

In this chapter we just want you to focus on service management commands:

Service Management Commands

  • start – Start the server
  • stop – Stop the server
  • restart – Stop and then start the server with a single command
  • status – Check the current status of the server

Configuration Management Commands

  • configtest – Verifies whether the nginx.conf file is valid or not
  • reload – Gracefully reloads the nginx.conf without restarting the server
  • force-reload – Forcefully reloads the nginx.conf

The next chapter will discuss the configuration related commands in detail. Discussion on miscellaneous commands is available in appendix X.

1.4 – NGINX Service Internals

When NGINX server is started, the following things happen:

  1. A master process is spawned at the operating system (OS) level
  2. The master process first reads and evaluates the NGINX configuration file (nginx.conf)
  3. The master process then spawns a number of worker processes at the OS level (as defined in nginx.conf)
  4. With the creation of the worker processes, the server becomes available to clients (as handling client requests is the primary job of a worker process)
  5. The master process now monitors the worker processes and listens for administrative commands such as stop, status, reload, etc.

A simplified view of NGINX server process architecture is:

Figure 1.2: NGINX server process architecture

To check which process ids are assigned to the NGINX service running on our system, just run the status command:

#check status
$ sudo service nginx status

This will tell you

  • how long the service has been up
  • the process id of the master process
  • the process id of each worker process

Figure 1.3

Now, with this basic knowledge about the NGINX server, you are ready to learn the details of the nginx.conf file, which is the topic of our next chapter.

1.5 – Review Questions

Q.1 – The job of the master process is to handle client requests. [True/False]

Q.2 – When you start the nginx service, the worker processes are spawned before the master process. [True/False]

Q.3 – Can you find out the uptime of the nginx service on your system?

Q.4 – Can you find out the exact version of NGINX on your system?

Q.5 – In section 1.4, we learned to check the process ids of the master and worker processes. Can you find some other way of getting the process ids assigned to the nginx service?

Q.6 – Take note of the process ids assigned to the nginx service on your system. Now restart the nginx service. Again take note of the assigned process ids. Did you get the same process ids?

Q.7 – Can you find out the path where nginx binary resides on your system?

1.6 – Answers to Review Questions

A.1 – False. Handling client requests is the job of the worker processes.

A.2 – False. The master process is the main controlling process; its job is to spawn the required number of worker processes.

A.3 – Using the status command we can find out the uptime:

sudo service nginx status

I got 37 minutes (yours will be different).

A.4 – As described in section 1.1, you can use

sudo nginx -v

A.5 – You can make use of the following command

ps -A | grep nginx

With this method we can’t differentiate between the process ids assigned to the master and worker processes. The process id of the master process is also saved in the /run/ file (as defined in the nginx.conf file), from where you can confirm it.
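That said, the process command line itself does distinguish the two roles, so you can filter on it. A simulated example (the sample ps output below is illustrative, not from a live system):

```shell
# sample `ps -A -o pid,cmd` output for an nginx service (illustrative)
ps_sample='  912 nginx: master process /usr/sbin/nginx
  913 nginx: worker process
  914 nginx: worker process'

# print only the master process id
echo "$ps_sample" | awk '/master process/ {print $1}'
```

On a real system you would pipe `ps -A -o pid,cmd` directly into the awk filter.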

A.6 – No. It is very unlikely that the operating system reuses the same process ids in a very short span of time. To restart the nginx service, we use the command:

sudo service nginx restart

Each restart is equivalent to first stopping the server and then starting it, as shown below:

	sudo service nginx stop
	sudo service nginx start

A.7 – One simple way of finding the nginx binary is:

	$ which nginx

Which results in:


1.7 – List of References:

  1. Download NGINX

Visualize Univariate Distribution of a Dataset (using Plotly)

The Plotly library makes interactive graphs. Using this library, a function named ddist has been written to visualize the data distribution of each variable within a dataset. This function may come in quite handy during the exploration of any dataset.

Since this function uses the plotly library, you must install it (install.packages("plotly")) and load it (library(plotly)) before calling the ddist function.


The ddist function takes a dataset (of type data.frame) as an input parameter and returns a list containing a plotly plot object for each variable. See below:

#Function for generating plot objects for each variable within the dataset (using plotly library).
#For numeric and integer variables a histogram is generated, while for others a barplot is generated.
#param: data.frame
#returns: list
ddist <- function(dataset) {
  #create a list for holding the plot objects
  plots <- vector("list", length(dataset))
  #iterate through each variable
  for(i in 1:length(dataset)) {
    #for numeric and integer variables plot a histogram
    if(is.numeric(dataset[,i]) || is.integer(dataset[,i])) {
      plots[[i]] <- plot_ly(x=dataset[,i]) %>% add_histogram(name=names(dataset)[i])
    #for the remaining variables plot a barplot
    } else {
      tbl <- table(dataset[,i])
      plots[[i]] <- plot_ly(x=names(tbl), y=tbl, name=names(dataset)[i], type='bar')
    }
  }
  #return the list of plots
  plots
}
Now, by using the above function, we can easily explore the data distribution of each variable within any dataset. For example, the code below passes the iris dataset to the ddist function and then calls the subplot function (of the plotly library) to display the resulting plots on two rows:

#generate plotly plot objects
plots <- ddist(iris)

#display plots on two rows
subplot(plots, nrows=2)

Click to view interactive plots of univariate distributions of iris dataset

Similarly passing the diamonds dataset to ddist function results in:

Click to view interactive plots of univariate distributions of diamonds dataset


Oracle Materialized View Complete Refresh over Database Link (DBLink)

Since Oracle 10g, a materialized view can be created in LOGGING mode (the default) or NOLOGGING mode. NOLOGGING mode does not provide transactional safety to the refresh process, and is therefore only suitable where the user process itself ensures data integrity.

Under LOGGING mode, the complete refresh of a materialized view is a three-step process:

  • delete the existing records
  • insert the records from data source
  • if above steps are successful then commit else rollback

This provides two key benefits:

  • during a refresh cycle, the materialized view will keep serving user queries with the existing data
  • new data will only be available after the commit (i.e. when the refresh cycle completes with no error)

Ideally, the above approach (LOGGING mode) will be sufficient in most scenarios. However, in cases where the materialized view is

  • fully refreshed over DBLink
  • and refresh time is more than couple of minutes
  • and lastly the network over which it is refreshed is not 100% fault tolerant

then a network glitch can leave your materialized view refresh process stuck indefinitely with the following message:

SQL*Net more data from dblink

In such scenarios, there are at least two solutions (although you may use both to further optimize and increase the resiliency of refresh process):

Solution 1: Create a job that kills the sessions of the materialized view refresh process that are stuck with the ‘SQL*Net more data from dblink’ message. Read more on this here.

Solution 2: Try to reduce the refresh time. With a decrease in refresh time, the chance of the refresh process facing a network glitch also reduces. Assuming that your data source query is already optimized, you may further reduce the refresh time as described below:

Step 1) Create a materialized view in NOLOGGING mode (with no indexes and no auto-refresh schedule) on the machine where the original materialized view exists


Step 2) Recreate the original materialized view in LOGGING mode with the above materialized view as the local data source (with required indexes and no auto refresh schedule)
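The DDL for these two steps could look like the following sketch; all object names, the dblink name, and the source query are assumptions for illustration:

```sql
-- Step 1: local staging MV in NOLOGGING mode, loaded over the dblink
-- (no indexes, refreshed on demand only)
CREATE MATERIALIZED VIEW mv_emp_stage
  NOLOGGING
  BUILD IMMEDIATE
  REFRESH COMPLETE ON DEMAND
AS
  SELECT * FROM emp@remote_db;

-- Step 2: the original MV in LOGGING mode, sourced from the local staging MV
CREATE MATERIALIZED VIEW mv_emp
  LOGGING
  BUILD IMMEDIATE
  REFRESH COMPLETE ON DEMAND
AS
  SELECT * FROM mv_emp_stage;

CREATE INDEX idx_mv_emp_id ON mv_emp (empno);
```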


Step 3) Create a refresh procedure that performs the following tasks:

1- refresh the materialized view created in NOLOGGING mode

2- verify the data integrity between the local and remote data sources

3- if verified, then do a complete refresh of the materialized view (created in LOGGING mode)

DECLARE
	iTotalLocalRecords Integer;
	iTotalRemoteRecords Integer;
BEGIN
	--refresh mv in nologging mode (the object names below are illustrative)
	DBMS_MVIEW.REFRESH('MV_EMP_STAGE', method => 'C');
	--verify record counts between the local and remote data sources
	SELECT COUNT(*) INTO iTotalLocalRecords FROM mv_emp_stage;
	SELECT COUNT(*) INTO iTotalRemoteRecords FROM emp@remote_db;
	--refresh only if count matches
	IF iTotalLocalRecords = iTotalRemoteRecords THEN
		DBMS_MVIEW.REFRESH('MV_EMP', method => 'C');
	END IF;
END;

Step 4) Schedule the procedure execution at your desired frequency

In our organizational setting, one of our materialized views (created in LOGGING mode) required more than 25 minutes for a complete refresh over a dblink (with more than 0.5 million records and a number of indexes). At least once a month this refresh process got stuck indefinitely with the ‘SQL*Net more data from dblink’ message.
After applying the above fix, the complete refresh time reduced to around 10 minutes (of which less than 20% was network time):

  • 1 to 2 minutes for materialized view refresh (created in NOLOGGING mode over DBLink)
  • 8 to 10 minutes for materialized view refresh (created in LOGGING mode with a number of indexes)

The cost of this solution is that our system now has to keep two copies of the same data. However, we further optimized the above solution by replacing the materialized view created in NOLOGGING mode with a table created in NOLOGGING mode. At the end of our refresh procedure we truncate the table to reclaim this space.



How to install a (portable) Oracle SQL Developer in Windows

Setting up Oracle SQL Developer on any machine may require less than 5 minutes, provided you have access to high-speed internet along with administrative rights on your machine. However, if like me you are working in an organization with restricted access, the portable version may be quite useful.

After googling around, I managed to create a portable version of Oracle SQL Developer 4.1 for the Windows platform. Below is a summary of the steps (with URL references) you may find useful in creating a portable version:

Step 1: Download Java (e.g. jdk-8u45-windows-i586.exe for 32-bit windows)

Step 2: Create a portable version of Java
Details of this step are nicely defined at ‘How to install a (portable) JDK in Windows without admin rights‘.

Step 3: Download Oracle SQL Developer (e.g

Step 4: Unzip the downloaded file to say ‘d:\sqldev-portable\’ folder

Step 5: Create a ‘jdk’ folder within ‘d:\sqldev-portable\sqldeveloper\’ folder

Step 6: Copy portable version of Java to the ‘jdk’ folder (as created in previous step)

Step 7: Define the user directory. This is the directory where sqldeveloper will save user-specific settings (on first use). On Windows, the default directory for this is

c:\users\myuser\AppData\Roaming\SQL Developer\

However, since we are creating a portable version, it would be better if these settings were also saved relative to our ‘d:\sqldev-portable\’ folder. To override the defaults, we need to add the following line to the SQL Developer configuration file (d:\sqldev-portable\sqldeveloper\bin\sqldeveloper.conf) (reference)

AddVMOption -Dide.user.dir=../../.sqldeveloper

Step 8: Now take a backup of the ‘d:\sqldev-portable\’ folder before executing the sqldeveloper.exe file (within the ‘d:\sqldev-portable\sqldeveloper\’ folder).


Lessons learned from the Book: Passionate Programmer

A few days back I came across this book and found it to be an excellent resource for individuals who enjoy programming but somehow don’t know how to pursue their passion in a way that can make a real difference, or at least bring happiness from their professional work.

A summary of my learnings from this book:

1) If programming (software development) isn’t your passion, then it’s better to switch your career (or at least plan to do so)


    • It is very hard to achieve personal satisfaction (happiness) from your work.

In today’s competitive world, around 50% of your waking time is spent at the workplace, so if that time doesn’t give you happiness, then the chances of getting happiness from the remaining time are very low.

    • It is difficult to keep yourself up to date.

With 50% of your time spent at the workplace, only your passion can keep you motivated to invest additional time in continuous learning activities.

    • Your chances of being innovative at work reduce.

There is an inherent risk of failure associated with every innovation. Failure helps humans clear their illusions and learn what not to do. So without passion, the ability to face failure and convert it into learning becomes very limited.

    • Life is precious but time-bounded.

Once time has passed, it’s gone forever. It would be unwise to spend your time and effort on activities that don’t touch you or become a source of happiness. For a moment, assume that you are now 60 years old, sitting on a chair and recalling your past. How would you feel about it? Does it give you satisfaction, or leave you unsatisfied with your achievements? Associating yourself with your passion seems to be the best bet you can make for securing your future.

2) Understand the demand and supply spectrum of software languages, platforms and technologies, so that you know where you stand. Certain languages or platforms are in more demand today than others. This demand opens up job opportunities but also attracts others to develop the same skills. In-demand languages and technologies also offer an incentive for outsourcing companies to provide such services, and hence are more likely to get outsourced. What this means is that you should also pursue less in-demand technologies (e.g. new technologies) so as to stay ahead of market demand.

3) Never completely commit yourself to a particular technology or language. This is equivalent to the common saying: don’t put all your eggs in one basket. If the company behind it goes down or stops promoting the specific technology, this can badly hit your career.

4) Broaden your knowledge spectrum from both ends. This means that you should not only try to develop expertise in a specific area (vertical growth) but should also try to develop your knowledge in other relevant areas (horizontal growth). For example, a Java expert, besides putting effort into knowing the Java internals, must equally put effort into developing skills in other relevant areas. This may include the ability to set up development/test environments, perform software testing, tune databases, etc.

5) Know your true potential by trying to associate yourself with individuals or teams that are better than you. If you manage to keep up with them, you will at least know that your true potential is no less than theirs. If you don’t do this, you will remain unaware of your true potential.

6) Get mentoring. Good mentors can really help you accelerate your pace of learning. A mentor can be your team lead, an associate, or even a book (for me, this book really deserves that status).

7) Become a mentor.

8) Train yourself in a daily plan/execute cycle. This will not only improve your planning and execution skills but will also help you track your progress.

9) Know that your cost of ownership to the company is at least twice your pay. Calculate this for yourself:

Hourly Cost = 2 x (monthly pay) / 160

By knowing how much an hour of yours costs the company, you will be better able to evaluate whether the work you deliver really matches that cost.
Making this a habit will help you in maintaining your focus on your planned

10) A few misconceptions about programming

– Since I am only working on the Java platform (and not looking at other technologies/platforms), I am a Java expert.

Why this is wrong: expertise requires in-depth knowledge.

To become an eye specialist, an individual must first complete his/her education in general medicine and then invest additional time and money in pursuing in-depth knowledge in the area of ophthalmology. Our field is no exception; the same logic follows. Therefore, for a person to be called a Java expert, he/she must have knowledge in areas like:

– the internal workings of the Java VM (virtual machine)
– the scenarios in which the Java VM will underperform or crash
– the internal workings of a Java application server
– how the Java VM differs from other virtual machines
and so on….

– Programmers don’t require practice.

Why this is wrong: humans, by nature, develop their skills through practice. If we look around, we find that in sports, players have regular practice sessions. Similarly in music, musicians and singers practice regularly.

Since programming is also an art, it can only thrive with regular practice. By the 80/20 rule, around 20% of a language’s features are used during the implementation of a typical project, so the average programmer focuses on these 20% of features. But what about the remaining 80%? Are they unimportant? I think these features are what give one language a distinctive advantage over another. Learning these features demands extra practice. So for a passionate programmer, practice is a key element that differentiates him from the others.

11) Perception matters. Accept this reality. It doesn’t matter how good you are at coding or at producing designs/architectures unless it helps your manager or customer develop a positive perception of you. This means you need to put extra effort into marketing your work. This can be a presentation, a training, a white paper or even an email communication (whatever applies).

12) Always be ready with an elevator speech. This means that whenever your boss, manager or CEO meets you and asks about your current activity, you should be able to describe your work in terms of the impact it will have on the business. Inability to describe your work in terms of business impact makes you and your work irrelevant, and hence develops a negative perception.

13) Improve your written communication skills. In face-to-face communication you are not fully dependent on your words; your facial expressions and body language also assist you in communicating your message. However, in many cases you have no option other than to communicate your message in written form: emails, letters, reports, etc. Inability to communicate effectively in written form can become a source of distraction.

14) Work habits a passionate programmer should possess:

– Invest your time/resources in developing your passion, but never use the workplace for this purpose
– Develop the habit of practicing programming in your leisure time
– Don’t treat the workplace as a practice ground
– Don’t use the workplace for your personal development
– Don’t check your personal emails, websites or blogs at the workplace (the purpose is to keep focus on your work)

Posted in Book Review, Passion | Tagged , , , , , | Leave a comment

Run Oracle SQL*Loader from SQL or PL/SQL

In continuation of my previous blog entry, where a generic framework for invoking operating system commands from SQL or PL/SQL was described [read], we will now demonstrate how this service can be used to invoke the SQL*Loader utility.

Business Case: Suppose you want to load a data file in ‘.csv‘ format containing a list of employees into the ‘emp‘ table in the ‘hr‘ schema from SQL. Step-by-step instructions for implementing this business case are described below:

Step 1: Create a folder named ‘myprojects‘ in the C drive

Step 2: Download the ‘‘ file and copy it to the ‘c:/myprojects‘ folder

Step 3: Download the SQL*Loader control file named ‘emp.ctl‘ and copy it to the ‘c:/myprojects‘ folder. The contents of this file are shown below

load data
infile ''
into table hr.emp
fields terminated by ","
( code, name, salary )

Step 4: Download a sample data file named ‘emp.csv‘ and copy it to the ‘c:/myprojects‘ folder. The contents of this file are shown below


Step 5: Create the emp table

create table hr.emp
(
       code varchar2(30),
       name varchar2(100),
       salary integer
);

Step 6: Open a command window, change the directory to ‘c:/myprojects‘ and then run the following command

python c:/myprojects/

Step 7: Open SQL*Plus, connect as user ‘sys as sysdba‘ and then run the following query

select utl_http.request(utl_url.escape('http://localhost:9090/run?command=sqlldr -userid=hr/hr control=./emp.ctl data=./emp.csv log=./emp.log')) from dual;
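Outside the database, the same escaped URL can be constructed from Python to sanity-check what utl_url.escape will send to the service; a minimal sketch using only the standard library (the host and port are the ones the service listens on):

```python
from urllib.parse import quote

base = 'http://localhost:9090/run?command='
command = 'sqlldr -userid=hr/hr control=./emp.ctl data=./emp.csv log=./emp.log'

# percent-encode the command portion (roughly what utl_url.escape does),
# keeping '/' and '=' literal so the query string stays intact
url = base + quote(command, safe='/=')
print(url)
```

Pasting the printed URL into a browser should hit the service in the same way the utl_http.request call does.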

In case you want to execute the above query as another database user (for example hr), you need to perform the following:

–  grant execute on utl_http to hr;
–  grant execute on utl_url to hr;
–  configure the Network ACL so that hr user can invoke the web-service

Step 8: Verify the results by issuing the following query            

select count(*) from hr.emp;

Posted in File system, Operating System, Oracle, Python, Web Service | Tagged , , , , , , | 3 Comments

Run Operating System (OS) Command in Oracle PL/SQL: A Web Service Approach

Three years back, when I was working on a project that required the execution of OS commands from Oracle 11g, I found the following options:

Option 1: Create Java Stored Procedure and access it through PL/SQL wrapper
Option 2: Use DBMS_SCHEDULER package to schedule an OS command
Option 3: Create a native shared library in C

Technical implementation details of the above options can be found here [1], [2], [3]

After weighing the pros and cons of each approach, I went with the first option. However, recently, while exploring Python’s networking libraries, I found that this simple command can share my files/folders over an HTTP interface

python -m SimpleHTTPServer 80

This made me think: can we use an HTTP interface for the invocation of operating system commands, for example:

  • running a shell/batch script from Oracle PL/SQL
  • getting list of running processes
  • getting list of files in a folder
  • getting contents of log file

A simple service architectural diagram is shown below:


Now, for implementing a proof of concept, I am using Python’s subprocess.Popen class (as discussed in my previous blog) for interacting with the operating system, and CherryPy — a minimalist Python web framework. Step-by-step instructions for creating such a service are given below:

Step 1: Setup Cherrypy

a) Download CherryPy and unzip it to a folder, say “c:/cherrypy”

b) Change your working directory to “c:/cherrypy” and run the following command

python install

Step 2: Create a python script named ‘’ and copy the following code into it

import cherrypy, subprocess

#override the default cherrypy port from 8080 to say 9090
cherrypy.config.update({'server.socket_port': 9090,})

#define the set of commands that are allowed to be executed over this interface
command_set = ['ls','ps','dir','head','tnsping','sqlldr']

class ServiceManager:

   def run(self, command, working_directory=None, environment=None):

      #check whether the given command is allowed over this interface or not
      if command.split()[0] in command_set:

         #run the command (split into a list of words, as Popen expects)
         p = subprocess.Popen(command.split(), shell=False, universal_newlines=True, cwd=working_directory, env=environment, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

         #read the stdout and stderr messages
         output =
         error =

         #if no error was reported then format the stdout message, otherwise the stderr message
         if len(error) == 0:
            str = self.format(output)
            str = self.format(error)
         str = 'Service Error: This command is not allowed over this interface!'

      #return the message
      return str = True

   def format(self, str):
      #render each output line as a row of an html table
      lst = str.split('\n')
      result = "<table>"
      for row in lst:
         result += "<tr><td><pre>" + row + "</pre></td></tr>"
      result += "</table>"
      print result
      return result

#start the embedded web server and expose ServiceManager over http
cherrypy.quickstart(ServiceManager())


Step 3: Start the service by issuing the following command

python c:/myprojects/

Step 4: Test the service in a web browser. The following URL should display the contents of the folder from which the python command was issued in step 3


Step 5: Test the service from sqlplus (connect as sysdba).

select utl_http.request(utl_url.escape('http://localhost:9090/run?command=ls -l')) from dual;

select utl_http.request(utl_url.escape('http://localhost:9090/run?command=head -5 c:/myprojects/')) from dual;

select utl_http.request(utl_url.escape('http://localhost:9090/run?command=tnsping my_remotedb')) from dual;

Please note that the current implementation of the service formats all output as HTML; however, this can be modified as per your requirements by refining the code of the format method in the ServiceManager class.

When comparing this option with the other three options (described at the start) in terms of security, this interface can pose a higher security risk. However, such risks can be mitigated with the following strategies:

  • only allow this interface to be accessible from local machine
  • limit the type of commands that may be executed over this interface
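The second mitigation — limiting the commands — is what the command_set check in the service code implements: only the first word of the incoming command (the executable name) is compared against an allow-list. The core of that check can be isolated and tested on its own:

```python
# allow-list matching the one in the service code above
command_set = ['ls', 'ps', 'dir', 'head', 'tnsping', 'sqlldr']

def is_allowed(command):
    # only the executable name is checked; arguments are passed through
    return command.split()[0] in command_set

print(is_allowed('ls -l'))      # True
print(is_allowed('rm -rf /'))   # False
```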

I think that, with proper risk mitigation strategies, this interface can open up numerous opportunities for performing operating-system-level operations from within SQL or PL/SQL in a minimal amount of time.

Posted in File system, Oracle | Tagged , , , , , , , , , , , , , , , , , | 3 Comments

Run Oracle Commands From Python (via SQL Plus)

Often one may want to run a query or execute a stored procedure from Python without installing any specific modules like cx_Oracle. In such cases it is often more convenient to use SQL*Plus (assuming sqlplus is already installed).

In Python, the Popen class (of the subprocess module) can be used to invoke a sqlplus session.

session = Popen(['sqlplus', '-S', 'hr/hr'], stdin=PIPE, stdout=PIPE, stderr=PIPE)

The first parameter of Popen is the actual command, given as a list of words. This means that if we want to run the command 'sqlplus -S hr/hr', we must pass it as a list of words, i.e. ['sqlplus', '-S', 'hr/hr']. Note that setting the -S flag when invoking sqlplus prevents it from writing the startup and termination messages to the stdout output interface.

Once a session is created, we can write our command(s) to stdin input interface (similar to as we would be writing on command window).

session.stdin.write('select sysdate from dual;\n')
session.stdin.write('select first_name from employees;\n')

Note that just writing to the stdin input interface will not communicate the results to stdout. For this we need to call the communicate function.

stdout, stderr = session.communicate()

This function returns two values (a tuple): stdout holds the results of the queries written to the stdin input interface, while stderr holds any error messages.
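The same write-then-communicate round trip can be exercised without an Oracle installation by piping a small script to the Python interpreter itself (a stand-in for sqlplus here, purely for illustration):

```python
import sys
from subprocess import Popen, PIPE

session = Popen([sys.executable, '-'], stdin=PIPE, stdout=PIPE, stderr=PIPE,
                universal_newlines=True)
# everything passed to communicate() is fed to the child process's stdin
stdout, stderr = session.communicate('print("hello from stdin")\n')
print(stdout.strip())   # hello from stdin
```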

The following code creates a wrapper function that takes sqlCommand and connectString as inputs and returns queryResult and errorMessage (if any):

from subprocess import Popen, PIPE

#function that takes the sqlCommand and connectString and returns the queryResult and errorMessage (if any)
def runSqlQuery(sqlCommand, connectString):
   session = Popen(['sqlplus', '-S', connectString], stdin=PIPE, stdout=PIPE, stderr=PIPE)
   return session.communicate(sqlCommand)

With the above function, running SQL queries, calling stored procedures or executing a SQL script file becomes quite easy.

#example 1: run a query that returns a numeric value
connectString = 'hr/hr'
sqlCommand = 'select count(*) from jobs;'
queryResult, errorMessage = runSqlQuery(sqlCommand, connectString)

#example 2: run a query that returns the next value of a sequence
connectString = 'hr/hr'
sqlCommand = 'select employees_seq.nextval from dual;'
queryResult, errorMessage = runSqlQuery(sqlCommand, connectString)

In the above examples, the queryResult variable will hold a string (i.e. the formatted output displayed by sqlplus). For example, in example 1, queryResult will hold a string like the one shown below:

‘\r\n  COUNT(*)\r\n———-\r\n\t19\r\n\r\n’

where the number 19 in this string represents the total count of records in the jobs table
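Since queryResult is just the formatted sqlplus text, extracting the actual number takes a little parsing. One hedged approach is to keep only the lines that are purely numeric once whitespace is stripped:

```python
raw = '\r\n  COUNT(*)\r\n----------\r\n\t19\r\n\r\n'   # sample output from example 1

# keep only lines that are purely digits after stripping whitespace
values = [line.strip() for line in raw.splitlines() if line.strip().isdigit()]
print(values)   # ['19']
```

This simple filter assumes the query returns integer values; anything involving dates, decimals or text columns would need a more careful parser.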

#example 3: call stored procedure secure_dml
connectString = 'hr/hr'
sqlCommand = 'execute  secure_dml;'
queryResult, errorMessage = runSqlQuery(sqlCommand, connectString)

#example 4: run sql commands defined in a .sql file (assuming that 'test.sql' is on the same path from where the python session was invoked)
connectString = 'hr/hr'
sqlCommand = '@test.sql'
queryResult, errorMessage = runSqlQuery(sqlCommand, connectString)

#example 5: run sql commands defined in a .sql file (similar to example 4) but with a parameter like a table name
#----- assuming the content of test.sql is: select count(*) from &1;
connectString = 'hr/hr'
sqlCommand = '@test.sql employees'
queryResult, errorMessage = runSqlQuery(sqlCommand, connectString)

Please note that the above examples will work on Python version 2.x. To run them on Python version 3.x, the only change required is to pass sqlCommand as a bytes object (not a string object). In Python, a string literal can be turned into a bytes literal by prefixing it with the letter b, for example

sqlCommand = b’select count(*) from jobs;’

With this exception, the rest of the code will run as described above. If you are interested in further details, go through pages 402-405 of “Python Essential Reference” by David M. Beazley.
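An alternative to converting every command to bytes on Python 3 is to open the pipes in text mode, which keeps all of the string-based examples above working unchanged (a sketch, not tested against every sqlplus version):

```python
from subprocess import Popen, PIPE

def runSqlQuery(sqlCommand, connectString):
    # universal_newlines=True makes stdin/stdout accept and return str, not bytes
    session = Popen(['sqlplus', '-S', connectString],
                    stdin=PIPE, stdout=PIPE, stderr=PIPE,
                    universal_newlines=True)
    return session.communicate(sqlCommand)
```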

Posted in Oracle, Python, SQL*Plus | Tagged , , , , , , , , , | 27 Comments