Wednesday, June 27, 2012

Improve the quality of your java applications


When implementing and debugging a class, it is sometimes useful to state conditions that should be true at a particular point in a method. These conditions, called assertions, help ensure a program’s validity by catching potential bugs and identifying possible logic errors during development. Preconditions and postconditions are two types of assertions. Preconditions are assertions about a program’s state when a method is invoked, and postconditions are assertions about a program’s state after a method finishes. While assertions can be stated as comments to guide the programmer during development, Java includes two versions of the assertstatement for validating assertions programatically. The assert statement evaluates a boolean expression and determines whether it is true or false. The first form of the assert statement is assert expression; This statement evaluates expression and throws anAssertionError if the expression is false. The second form is assert expression1 : expression2; This statement evaluatesexpression1 and throws an AssertionError with expression2 as the error message if expression1 is false. You can use assertions to programmatically implement preconditions and postconditions or to verify any other intermediate states that help you ensure your code is working correctly. The example in Fig. 13.9 demonstrates the functionality of the assert statement. Line 11 prompts the user to enter a number between 0 and 10, then line 12 reads the number from the command line. The assert statement on line 15 determines whether the user entered a number within the valid range. If the user entered a number that is out of range, then the program reports an error. Otherwise, the program proceeds normally.

Fig. 13.9 Checking with assert that a value is within range.
   1  // Fig. 13.9: AssertTest.java
   2  // Demonstrates the assert statement
   3  import java.util.Scanner;
   4  
   5  public class AssertTest
   6  {
   7     public static void main( String args[] )
   8     {
   9        Scanner input = new Scanner( System.in );
  10        
  11        System.out.print( "Enter a number between 0 and 10: " );
  12        int number = input.nextInt();
  13        
  14        // assert that the absolute value is >= 0
  15        assert ( number >= 0 && number <= 10 ) : "bad number: " + number;
  16        
  17        System.out.printf( "You entered %d\n", number );
  18     } // end main
  19  } // end class AssertTest

 Enter a number between 0 and 10: 5You entered 5 
 Enter a number between 0 and 10: 50Exception in thread "main" java.lang.AssertionError: bad number: 50        at AssertTest.main(AssertTest.java:15) 

Assertions are primarily used by the programmer for debugging and identifying logic errors in a application. By default, assertions are disabled when executing a program because they reduce performance and are unnecessary for the program’s user. To enable assertions at runtime, use the -ea command-line option when to the java command. To execute the program in Fig. 13.9 with assertions enabled, type
java -ea AssertTest

SQL Tuning/SQL Optimization Techniques


1) The sql query becomes faster if you use the actual columns names in SELECT statement instead of than '*'.

For Example: Write the query as
SELECT id, first_name, last_name, age, subject FROM student_details;
Instead of:
SELECT * FROM student_details;

2) HAVING clause is used to filter the rows after all the rows are selected. It is just like a filter. Do not use HAVING clause for any other purposes. 
For Example: Write the query as
SELECT subject, count(subject) 
FROM student_details 
WHERE subject != 'Science' 
AND subject != 'Maths' 
GROUP BY subject;
Instead of:
SELECT subject, count(subject) 
FROM student_details 
GROUP BY subject 
HAVING subject!= 'Vancouver' AND subject!= 'Toronto';

3) Sometimes you may have more than one subqueries in your main query. Try to minimize the number of subquery block in your query. 
For Example: Write the query as
SELECT name 
FROM employee 
WHERE (salary, age ) = (SELECT MAX (salary), MAX (age) 
FROM employee_details) 
AND dept = 'Electronics'; 
Instead of:
SELECT name 
FROM employee
WHERE salary = (SELECT MAX(salary) FROM employee_details) 
AND age = (SELECT MAX(age) FROM employee_details) 
AND emp_dept = 'Electronics';

4) Use operator EXISTS, IN and table joins appropriately in your query. 
a) Usually IN has the slowest performance. 
b) IN is efficient when most of the filter criteria is in the sub-query. 
c) EXISTS is efficient when most of the filter criteria is in the main query.
For Example: Write the query as
Select * from product p 
where EXISTS (select * from order_items o 
where o.product_id = p.product_id)
Instead of:
Select * from product p 
where product_id IN 
(select product_id from order_items

5) Use EXISTS instead of DISTINCT when using joins which involves tables having one-to-many relationship. 
For Example: Write the query as
SELECT d.dept_id, d.dept 
FROM dept d 
WHERE EXISTS ( SELECT 'X' FROM employee e WHERE e.dept = d.dept);
Instead of:
SELECT DISTINCT d.dept_id, d.dept 
FROM dept d,employee e 
WHERE e.dept = e.dept;

6) Try to use UNION ALL in place of UNION. 
For Example: Write the query as
SELECT id, first_name 
FROM student_details_class10 
UNION ALL 
SELECT id, first_name 
FROM sports_team;
Instead of:
SELECT id, first_name, subject 
FROM student_details_class10 
UNION 
SELECT id, first_name 
FROM sports_team;

7) Be careful while using conditions in WHERE clause. 
For Example: Write the query as
SELECT id, first_name, age FROM student_details WHERE age > 10;
Instead of:
SELECT id, first_name, age FROM student_details WHERE age != 10;
Write the query as
SELECT id, first_name, age 
FROM student_details 
WHERE first_name LIKE 'Chan%';
Instead of:
SELECT id, first_name, age 
FROM student_details 
WHERE SUBSTR(first_name,1,3) = 'Cha';
Write the query as
SELECT id, first_name, age 
FROM student_details 
WHERE first_name LIKE NVL ( :name, '%');
Instead of:
SELECT id, first_name, age 
FROM student_details 
WHERE first_name = NVL ( :name, first_name);
Write the query as
SELECT product_id, product_name 
FROM product 
WHERE unit_price BETWEEN MAX(unit_price) and MIN(unit_price)
Instead of:
SELECT product_id, product_name 
FROM product 
WHERE unit_price >= MAX(unit_price) 
and unit_price <= MIN(unit_price)
Write the query as
SELECT id, name, salary 
FROM employee 
WHERE dept = 'Electronics' 
AND location = 'Bangalore';
Instead of:
SELECT id, name, salary 
FROM employee 
WHERE dept || location= 'ElectronicsBangalore';
Use non-column expression on one side of the query because it will be processed earlier.
Write the query as
SELECT id, name, salary 
FROM employee 
WHERE salary < 25000;
Instead of:
SELECT id, name, salary 
FROM employee 
WHERE salary + 10000 < 35000;
Write the query as
SELECT id, first_name, age 
FROM student_details 
WHERE age > 10;
Instead of:
SELECT id, first_name, age 
FROM student_details 
WHERE age NOT = 10;
8) Use DECODE to avoid the scanning of same rows or joining the same table repetitively. DECODE can also be made used in place of GROUP BY or ORDER BY clause. 
For Example: Write the query as
SELECT id FROM employee 
WHERE name LIKE 'Ramesh%' 
and location = 'Bangalore';
Instead of:
SELECT DECODE(location,'Bangalore',id,NULL) id FROM employee 
WHERE name LIKE 'Ramesh%';
9) To store large binary objects, first place them in the file system and add the file path in the database.
10) To write queries which provide efficient performance follow the general SQL standard rules.
a) Use single case for all SQL verbs
b) Begin all SQL verbs on a new line
c) Separate all words with a single space 
d) Right or left aligning verbs within the initial SQL verb

Showing Execution Plan details for a Query


Use these commands one by one the you will get the execution plan for a query:
set auto on;
set autotrace on;
select * from employee;



select * from employee;




       ENO ENAME                                 MID
---------- ------------------------------ ----------
         1 syam                                    2
         2 ram                                     2
         3 raj                                     2
         4 rajgopal                                3
         5 satish                                  3
         6 sundar                                  3


6 rows selected.




Execution Plan
----------------------------------------------------------
Plan hash value: 2119105728


------------------------------------------------------------------------------
| Id  | Operation         | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |          |     6 |   258 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS FULL| EMPLOYEE |     6 |   258 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------


Note
-----

Hadoop



Hadoop is a large-scale distributed batch processing infrastructure. While it can be used on a single machine, its true power lies in its ability to scale to hundreds or thousands of computers, each with several processor cores. Hadoop is also designed to efficiently distribute large amounts of work across a set of machines.
How large an amount of work? Orders of magnitude larger than many existing systems work with. Hundreds of gigabytes of data constitute the low end of Hadoop-scale. Actually Hadoop is built to process "web-scale" data on the order of hundreds of gigabytes to terabytes or petabytes. At this scale, it is likely that the input data set will not even fit on a single computer's hard drive, much less in memory. So Hadoop includes a distributed file system which breaks up input data and sends fractions of the original data to several machines in your cluster to hold. This results in the problem being processed in parallel using all of the machines in the cluster and computes output results as efficiently as possible.