1. Backwash
The effect of testing on teaching and learning is known as backwash. Backwash can be harmful or beneficial. If a test is regarded as important, preparation for it can come to dominate all teaching and learning activities, and if the test content and testing techniques are at variance with the objectives of the course, there is likely to be harmful backwash. However, backwash can also be positively beneficial. Consider, for example, a test administered at the end of an intensive year of English study and used to determine which students will be allowed to go on to their undergraduate courses and which will have to leave the university. Because so much depends on the result, such a test can pull teaching and learning towards the objectives of the course, provided its content and techniques reflect those objectives.
Davies (1968:5) has said that ‘the good test is an obedient servant since it follows and apes the teaching’. Yet the proper relationship between teaching and testing is surely that of partnership. It is true that there may be occasions when the teaching is good and appropriate and the testing is not; we are then likely to suffer from harmful backwash. But equally there may be occasions when teaching is poor or inappropriate and when testing is able to exert a beneficial influence. We cannot expect testing only to follow teaching. What we should demand of it, however, is that it should be supportive of good teaching and, where necessary, exert a corrective influence on bad teaching. If testing always had a beneficial backwash on teaching, it would have a much better reputation amongst teachers.

2. Inaccurate Tests
There are two main sources of inaccuracy. The first of these concerns test content and techniques. If we want to know how well someone can write, there is absolutely no way we can get a really accurate measure of their ability by means of a multiple-choice test. Professional testers have expended great effort, and not a little money, in attempts to do it, but they have always failed. The result is a set of poor items that cannot possibly provide accurate measurements. The second source of inaccuracy is lack of reliability. A test is reliable if it measures consistently. On a reliable test you can be confident that someone will get more or less the same score whether they happen to take it on one particular day or on the next, whereas on an unreliable test the score is quite likely to be considerably different, depending on the day on which it is taken.
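One common way to put a number on this kind of consistency is to correlate the scores the same people obtain on two sittings of a test. The sketch below is only an illustration, not a procedure described in the text: the function name `pearson` and all the scores are invented for the example. A correlation close to 1 suggests that candidates keep roughly the same relative standing from one day to the next, while a value near 0 suggests that they do not.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length (at least 2 scores)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the two means.
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when all scores are identical")
    return covariation / (spread_x * spread_y)


# Hypothetical scores for six candidates (invented for illustration).
day_one = [52, 61, 70, 45, 88, 67]
reliable_day_two = [54, 60, 72, 44, 86, 69]    # scores barely move
unreliable_day_two = [70, 48, 55, 66, 62, 80]  # scores shift unpredictably

print("reliable test:  ", round(pearson(day_one, reliable_day_two), 2))
print("unreliable test:", round(pearson(day_one, unreliable_day_two), 2))
```

With these invented figures the first pair of sittings correlates very highly and the second hardly at all, which is the contrast the paragraph describes.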
Unreliability has two origins: features of the test itself, and the way it is scored. In the first case, something about the test creates a tendency for individuals to perform significantly differently on the different occasions when they might take it. Features that can make a test unreliable in this way include unclear instructions, ambiguous questions, and items that lead test takers to guess. In the second case, equivalent test performances are accorded significantly different scores. For example, the same composition may be given very different scores by different markers. Fortunately, there are well-understood ways of minimizing such differences in scoring. Most large testing organizations, to their credit, take every precaution to make their tests, and the scoring of them, as reliable as possible, and are generally highly successful in this respect. Small-scale testing, on the other hand, tends to be less reliable than it should be.
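The second origin, disagreement between markers, can be illustrated in a similar way. The sketch below assumes a simple measure, the average gap between the scores two markers give to the same compositions; the function name `mean_absolute_difference` and the marks are hypothetical, and this is only one of several ways such disagreement might be summarised.

```python
def mean_absolute_difference(marker_a, marker_b):
    """Average gap between two markers' scores for the same compositions."""
    if len(marker_a) != len(marker_b) or not marker_a:
        raise ValueError("both markers must score the same, non-empty set of scripts")
    gaps = [abs(a - b) for a, b in zip(marker_a, marker_b)]
    return sum(gaps) / len(gaps)


# Hypothetical marks out of 20 for the same five compositions.
marker_a = [12, 15, 9, 18, 14]
marker_b = [13, 14, 9, 17, 15]   # close agreement with marker A
marker_c = [7, 19, 14, 11, 20]   # very different judgements

print("A vs B:", mean_absolute_difference(marker_a, marker_b))
print("A vs C:", mean_absolute_difference(marker_a, marker_c))
```

A small average gap, as between the first two markers here, indicates that equivalent performances receive similar scores; a large gap is the kind of scoring unreliability described above.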

3. The Need for Tests
It might be argued that teaching is, after all, the primary activity, and that if testing comes into conflict with it, then it is testing which should go, especially when it has been admitted that so much testing provides inaccurate information. However, information about people’s language ability is often very useful and sometimes necessary. Within teaching systems, too, as long as it is thought appropriate for individuals to be given a statement of what they have achieved in a second or foreign language, tests of some kind or other will be needed. They will also be needed in order to provide information about the achievement of groups of learners, without which it is difficult to see how rational educational decisions can be made. Moreover, we have to recognize the need for a common yardstick, which tests provide, in order to make meaningful comparisons. If it is accepted that tests are necessary, and if we care about testing and its effect on teaching and learning, we should do everything that we can to improve the practice of testing.

4. What Is to Be Done?
The teaching profession can make two contributions to the improvement of testing: its members can write better tests themselves, and they can put pressure on others, including professional testers and examining boards, to improve their tests.

