How to Run Scrapy Spiders on Cloud Using Heroku and Redis

January 3rd, 2015 No comments


This tutorial shows how to install the Scrapy plugins required for Heroku and Redis support, deploy a sample spider to Heroku, run it periodically (daily, hourly, etc.), and store the scraped items in a Redis instance. We will use a free Heroku machine and the Redis add-on, so you can have a spider running in the cloud for free.
Read more…

Three Levels of API Testing

December 15th, 2013 1 comment

A large percentage of developers deal with APIs every day. We either integrate a third-party API into our application, or we develop APIs and make them accessible to other developers. As the importance of API development grows, we also realize that being able to test these APIs properly is crucial. There are various ways and levels of testing that can be applied to APIs. Here I would like to discuss three testing levels that provide comprehensive coverage when put together and that have worked well for me in my recent experience.

1. Unit Testing

Unit testing has been around for a long time and has proven to be very helpful when done correctly. This level of testing is appropriate for verifying the units of our application, which correspond to classes in OO languages. It is very important to note that a unit test should test a single unit, isolated from all of its dependencies. Software modules depend on each other, but because unit testing is intended for testing a unit and not its dependencies, we usually mock the dependencies and give them predefined behaviors to verify that the unit under test works the way we expect.

Mocking is a key concept in unit testing. In Java, there are plenty of mocking frameworks we can use along with JUnit, such as JMockit, Mockito and EasyMock. I personally prefer JMockit, but the other frameworks should provide similar functionality. These frameworks usually use annotations to mark dependency objects and provide tools to give them predefined behaviors when they are invoked by the unit we are testing. The following is an example for JMockit:

   @Test
   public void doBusinessOperationXyz(@Mocked final Dependency mockInstance)
   {
      new NonStrictExpectations() {{
         // An expectation for an instance method:
         mockInstance.someMethod(1, "test");
         result = "mocked";
      }};

      // A call to the unit under test occurs here, leading to mock invocations
      // that may or may not match specified expectations.
   }

In this unit test we mark the Dependency object as a mocked dependency with the @Mocked annotation and specify that when someMethod is invoked with the parameters 1 and “test”, the dependency object should return the string “mocked”. When the unit we are testing invokes someMethod with those parameters, JMockit takes control at runtime and returns “mocked” instead of executing the real implementation in the Dependency class. This way we isolate the unit from its dependencies and focus on the behavior of the unit under test.

Mocking is crucial when we have to deal with database interactions, external services or I/O operations. When we unit test a class that depends on a DAO module to return data from a data source, we should mock this interaction to avoid making a real call to that data source. As a principle, unit tests don’t use real database connections, don’t use the network to access external services and don’t perform I/O operations. When writing unit tests, we need to keep in mind that we are not testing the interaction between the unit under test and its dependencies, so it’s best to mock those interactions to keep the tests simple, fast and reliable.
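The DAO case can also be sketched without a mocking framework, using a hand-written stub in place of JMockit. Every name below (UserDao, GreetingService, StubUserDao) is hypothetical and chosen only for illustration; the point is that the unit under test sees predefined answers and never touches a real data source.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical dependency: in production this would query a real data source.
interface UserDao {
    String findNameById(int id);
}

// Hypothetical unit under test: it depends only on the UserDao interface.
class GreetingService {
    private final UserDao userDao;

    GreetingService(UserDao userDao) {
        this.userDao = userDao;
    }

    String greet(int userId) {
        String name = userDao.findNameById(userId);
        return name == null ? "Hello, guest" : "Hello, " + name;
    }
}

// Hand-written stub with predefined behavior; no database is involved.
class StubUserDao implements UserDao {
    private final Map<Integer, String> names = new HashMap<>();

    StubUserDao with(int id, String name) {
        names.put(id, name);
        return this;
    }

    @Override
    public String findNameById(int id) {
        return names.get(id); // null when the id was never registered
    }
}

class GreetingServiceExample {
    public static void main(String[] args) {
        GreetingService service = new GreetingService(new StubUserDao().with(1, "Ada"));
        System.out.println(service.greet(1)); // known user
        System.out.println(service.greet(2)); // unknown user falls back to "guest"
    }
}
```

A mocking framework removes the need to write the stub class by hand, but the isolation principle is the same.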

Writing unit tests is considered a part of software development. We should run them as often as we can and tie them to the build phase of the application. It’s worth spending the time to keep unit test coverage high, and we should be aware of what percentage of our code is covered by unit tests. There are great tools like Cobertura to keep track of unit test coverage in your application. It can generate reports for your source code that give you valuable information about the line and branch coverage of your classes, and it even shows you which lines are not covered by your tests. More advanced tools like Sonar go beyond code coverage: they analyze your code, give you tips to improve code quality and point you to places where you might have a bug.

2. Integration Testing

Integration tests, as the name suggests, are supposed to test the integration between different components of the software. Unlike unit tests, integration tests don’t mock dependencies. Assuming that we expose a RESTful endpoint that returns data it fetches from a database in JSON format, we could write an integration test that hits this endpoint and verifies the response data along with the HTTP response code. In this test, the RESTful service would connect to a real database and execute the proper SQL queries. The point here is to ensure that the interactions between software components are correct.
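A minimal sketch of such a check, using only the JDK (Java 11 or later), might look like the following. The /users/1 path and the JSON body are made up for illustration, and the embedded server is only a stand-in so the sketch runs on its own; in a real integration test the URL would point at the deployed service backed by its real database.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

class EndpointIntegrationCheck {

    // Starts a stand-in service on a free port. In a real integration test this
    // would be the deployed application talking to its real database.
    static HttpServer startStandInService() throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress("localhost", 0), 0);
        server.createContext("/users/1", exchange -> {
            byte[] body = "{\"id\":1,\"name\":\"Ada\"}".getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().add("Content-Type", "application/json");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    // Sends a GET request and returns the raw response for verification.
    static HttpResponse<String> get(String url) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        return client.send(request, HttpResponse.BodyHandlers.ofString());
    }

    public static void main(String[] args) throws Exception {
        HttpServer server = startStandInService();
        try {
            int port = server.getAddress().getPort();
            HttpResponse<String> response = get("http://localhost:" + port + "/users/1");
            // An integration test would assert on both of these values.
            System.out.println(response.statusCode());
            System.out.println(response.body());
        } finally {
            server.stop(0);
        }
    }
}
```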

Integration tests provide broader coverage for your application since they exercise multiple components of the software. They are also more fragile, as they might be affected by changes to the environment and they rely on the accessibility of other parts of the system, like a data source or an external web service. Unit tests, in contrast, rely only on pure Java code, which makes it possible to run them during each build or whenever we change a line of code and wonder if we broke anything. Running an integration test suite also takes more time, for the same reasons. Therefore, it makes a lot of sense to keep integration tests apart from unit tests and not run them as often. A good practice might be to run them after each deploy to a test environment.

A rule of thumb when writing integration tests is to define them in a way that they can be executed multiple times consistently. The beauty of software testing reveals itself when you run the tests over and over again and see what you broke as you change the code. It’s not fun to figure out tests are failing because they are written improperly. Our goal should be writing integration tests that can be run multiple times, and this can be achieved by making sure that individual tests don’t affect each other and they don’t rely on results from a previous test.
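One way to get this kind of repeatability is to have each test create its own data under a unique key and clean up afterwards. The sketch below illustrates the idea with a hypothetical in-memory ItemStore standing in for a shared database table; the class and method names are assumptions made for the example.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical shared resource standing in for a database table.
class ItemStore {
    private final Map<String, String> rows = new ConcurrentHashMap<>();

    void insert(String id, String value) { rows.put(id, value); }
    String find(String id) { return rows.get(id); }
    void delete(String id) { rows.remove(id); }
    int size() { return rows.size(); }
}

class RepeatableTestExample {

    // Creates its own row under a unique id, verifies it, and always cleans up,
    // so the outcome does not depend on earlier tests or earlier runs.
    static boolean insertAndFindTest(ItemStore store) {
        String id = "test-" + UUID.randomUUID();
        store.insert(id, "widget");
        try {
            return "widget".equals(store.find(id));
        } finally {
            store.delete(id);
        }
    }

    public static void main(String[] args) {
        ItemStore store = new ItemStore();
        for (int run = 1; run <= 3; run++) {
            System.out.println("run " + run + " passed: " + insertAndFindTest(store));
        }
        System.out.println("rows left behind: " + store.size());
    }
}
```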

Integration testing for APIs can be done by simply writing clients for the service endpoints. In the case of a RESTful API, any HTTP client framework would work fine. A great benefit here is that the client can be written in any language, because it is basically sending HTTP requests and verifying the responses. Frameworks like the Ruby-based RSpec are becoming popular for this kind of testing. Also, JAX-RS implementations like Jersey provide built-in test frameworks that can be used for this purpose.

Unit and integration tests, when combined, might give you enough confidence to think your API is ready for production. But how do we ensure that the API will still work fine under production conditions?

3. Performance Testing

This level of testing is done after we know that all the functionality works properly and we want to understand how the API behaves under high load. Performance testing is overlooked by a lot of developers, but if your API is going to be used by multiple consumers concurrently, it is very important to do performance testing to detect potential concurrency bugs that would be really tedious to reproduce and troubleshoot once the API is in production. The idea here is to make API calls from multiple threads in a random order and monitor the behavior of the application for a certain period of time.
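The multi-threaded idea can be sketched in plain Java with a thread pool. Here the "API" is a hypothetical in-process HitCounter rather than a real endpoint, and the thread and call counts are arbitrary; a real performance test would send HTTP requests to a test machine instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class ConcurrentLoadSketch {

    // Hypothetical handler with shared state; AtomicInteger keeps it thread-safe.
    static class HitCounter {
        private final AtomicInteger hits = new AtomicInteger();

        int handleRequest() { return hits.incrementAndGet(); }
        int total() { return hits.get(); }
    }

    // Fires threads * callsPerThread calls at the handler from a thread pool
    // and returns the total the handler reports afterwards.
    static int runLoad(HitCounter counter, int threads, int callsPerThread) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Void>> tasks = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                tasks.add(() -> {
                    for (int i = 0; i < callsPerThread; i++) {
                        counter.handleRequest();
                    }
                    return null;
                });
            }
            // invokeAll blocks until every task has finished.
            for (Future<Void> f : pool.invokeAll(tasks)) {
                f.get(); // surfaces any exception thrown inside a task
            }
        } finally {
            pool.shutdown();
        }
        return counter.total();
    }

    public static void main(String[] args) throws Exception {
        int total = runLoad(new HitCounter(), 8, 10_000);
        System.out.println("expected 80000 hits, handler counted " + total);
    }
}
```

If the counter were a plain int field incremented without synchronization, the reported total would typically fall short under this load, which is exactly the kind of concurrency bug such a test is meant to surface.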

Apache JMeter is a well-suited tool for this purpose. One can easily create a performance test script for service endpoints and pound a test machine with a surge of requests. Performance testing is not only good for revealing concurrency problems in your application; it also gives you a chance to monitor how your application uses the CPU and memory resources of a machine. Monitoring these resources during a performance test can uncover memory leaks, insufficient hardware resources or poor configuration of web servers, load balancers, etc. It is fair to say that performance testing is key to building scalable applications.

Monitoring server behavior is considered a part of performance testing, and it is always a good idea to use a monitoring tool for that purpose. Although tools like JMeter give you a detailed report of how the server responded to each request, they don’t know anything about the server state during the test. For comprehensive monitoring of web requests, database operations, server resources and a lot of other things, New Relic is a great product.

It’s also worth mentioning longevity testing, which is a special form of performance testing. While performance tests usually generate a surge of parallel requests to create high load, longevity tests generate fewer requests but run for a longer period of time. The point here is to see how the application behaves under regular load when it has been in use for a few days or maybe a week. Poorly configured environments might lead to unpredictable system inconsistencies that you would want to investigate as early as possible.


These three levels of testing should go a long way toward helping you write maintainable, functional and scalable APIs. Writing unit and integration tests might seem to increase development time in the beginning, but over time the benefits become much more visible, and the tests will actually save you a lot of time by catching bugs early. Also, with a comprehensive test suite you will feel much more confident and will not avoid refactoring your code out of fear of breaking something. Happy testing, everybody.

Static Keyword and Its Usage in Java

April 1st, 2013 No comments

One of the things that confuse people trying to learn Java is static fields and methods. Although the use of the static keyword is very simple, it is easy for beginners to get confused while studying ‘static’. In this article we will look into the details of the static keyword and clarify the points that confuse developers.

Read more…
