Background

Opening up Big Data in Finance

The City does not have the strongest record when it comes to transparency. Until 1969 banks were exempt from publicly revealing their true profits, a requirement imposed on other kinds of companies long before then.1 Although more recently the ‘third pillar’ of the Bank for International Settlements (BIS) global standards of best bank practice encouraged financial firms to make more data open, the sector as a whole remains opaque. This situation may have both caused and prolonged the recent financial crisis. Firstly, imperfect and asymmetric information in markets resulted in the mispricing of financial instruments. Once the crisis struck, an orderly re-pricing of these instruments could not occur because investors were uncertain about all the terms and conditions embedded within them.2 According to Donald Kohn, member of the Bank of England’s Financial Policy Committee (FPC), “Complex and poorly understood instruments were at the heart of the crisis. Transparency about these structures—full information about them readily available to all market participants—is required to protect financial stability.”3

Regulatory lessons like this are often learned following financial crises. Following the failure of the City of Glasgow Bank in the nineteenth century, for example, Parliament passed the 1879 Companies Act mandating for the first time that banks publish their balance sheets.4 Once again open data is an important component of post-crisis regulatory reforms. Thus the very first policy recommendation of the FPC in June 2011 advised micro-prudential regulators to make public disclosure of sovereign and banking sector exposures by major UK banks a permanent part of their reporting framework.5

Regulation of the conventional financial sector is following where innovative firms in the private sector are leading. Take ‘peer-to-peer’ (P2P) platforms, which have grown rapidly following the financial crisis. In brief, P2P platforms are online sites that channel funds from lenders to recipients. At this basic level they are much like banks. However, unlike banks, P2P platforms usually do not invest these funds solely at their own discretion.6 Instead some enable investors to make investment decisions directly, either by choosing the recipients they fund or by defining the general social or financial features of the projects in which they want to invest. As a rule of thumb, the specific relationships made between lenders and recipients on P2P platforms are more transparent than when those relationships are indirectly constituted through banks.

Transparency also characterises the approach many P2P platforms take to publicly disclosing data about their business.7 For example, Funding Circle makes its loan-by-loan data publicly available through its website. The data is downloadable in an Excel file and contains every loan Funding Circle has intermediated, whether outstanding or already paid off, since it started trading in 2010. The dataset includes a wide range of information on these loans, including the original credit band of the recipient, purpose of the loan, the loan term and loan amount, the interest rate, and the next repayment date of the loan, among other characteristics. The data are regularly updated, generally by the close of business each day.

This project expands Funding Circle’s open data set with additions from RateSetter and Zopa. Together these platforms comprise over 92 percent of the UK P2P market.8 As far as we are aware, this project is therefore the most comprehensive snapshot of the UK P2P market published to date. This is a tangible benefit for the participating firms in this study because it sheds light on the wider P2P market in which they compete. It may also benefit the wider public by providing a more comprehensive picture of market prices. This information can lead consumers to make better decisions and the P2P market overall to become more efficient.

The P2P market is increasingly important, growing from practically nil before the financial crisis to an estimated cumulative size of £558 million in the UK at present.9 Two factors account for its rapid growth. First, lenders have been attracted by the relatively high rates of return available from lending in this market, given the current low yields on bank and other conventional debt instruments. There is also evidence that the P2P market is outperforming equities. For example, according to West One Loans, direct lending to small growth companies via P2P sites delivered higher returns than investing in smaller companies’ shares in the year to 31 March 2013.10

At the same time, growth of the P2P market has been underpinned by increasing demand from recipients as liquidity from conventional sources has dried up. Last year the Breedon Review estimated that demand for new funding by all businesses exceeded supply by between £84 billion and £191 billion,11 with small and medium enterprises (SMEs) facing particular difficulties obtaining funds.12 This is a result of banks becoming more risk averse because of their own and regulators’ concerns regarding the adequacy of their capital to absorb potential losses. Consequently, new flows of lending to SMEs have contracted, with a 271 percent increase in the number of unsuccessful SME bank loan applications between 2007 and 2010.13 A number of experimental financing initiatives have developed to fill the funding gap, with P2P platforms at the fore.14

While a number of commentators have highlighted the important role P2P platforms are now playing in funding SME and consumer loans in the UK, the focus of this project is sharper. Specifically, we focus on visualising the regional geography of lending in the UK P2P market. In so doing, we show which regions are net lenders and recipients, and analyse lending in terms of volume and price.
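To make the idea of net lending and net receiving regions concrete, the following is a minimal Python sketch of one way such a regional summary could be computed from loan-part records. It is an illustration only, not the method used in this project: the field names (lender_region, recipient_region, amount, rate), the function name and the three sample rows are all hypothetical, and the average rate is weighted by the amount received as one plausible measure of price.

```python
from collections import defaultdict


def regional_summary(loan_parts):
    """Summarise lending volume and price by region.

    Each loan part is a dict with hypothetical keys:
    lender_region, recipient_region, amount (GBP) and rate (annual %).
    """
    lent = defaultdict(float)            # total lent out by each region
    received = defaultdict(float)        # total received by each region
    rate_times_amount = defaultdict(float)

    for part in loan_parts:
        lent[part["lender_region"]] += part["amount"]
        received[part["recipient_region"]] += part["amount"]
        rate_times_amount[part["recipient_region"]] += part["amount"] * part["rate"]

    summary = {}
    for region in sorted(set(lent) | set(received)):
        out_flow = lent.get(region, 0.0)
        in_flow = received.get(region, 0.0)
        summary[region] = {
            "lent": out_flow,
            "received": in_flow,
            # Positive: the region is a net lender; negative: a net recipient.
            "net_lending": out_flow - in_flow,
            # Volume-weighted average rate paid by recipients in the region.
            "avg_rate_paid": rate_times_amount[region] / in_flow if in_flow else None,
        }
    return summary


# Made-up sample rows, for illustration only.
parts = [
    {"lender_region": "London", "recipient_region": "North West", "amount": 100.0, "rate": 8.0},
    {"lender_region": "London", "recipient_region": "London", "amount": 50.0, "rate": 6.0},
    {"lender_region": "North West", "recipient_region": "London", "amount": 50.0, "rate": 10.0},
]
summary = regional_summary(parts)
```

In this toy sample London lends more than it receives and so appears as a net lender, while the North West appears as a net recipient. With real data the same aggregation would run over loan parts rather than whole loans, since each loan is funded by many lenders.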

There are two key reasons for the geographic focus of this project. First, there are longstanding concerns in the UK about a perceived ‘North-South’ divide when it comes to obtaining investment, with the North perceived to be at a disadvantage when compared to regions in the South, particularly London.15 We were therefore interested to understand to what extent, if any, P2P platforms are bridging this regional funding gap. The second reason for focusing on the geography of the market is more pragmatic. Although each P2P platform records a number of characteristics related to the loans it intermediates, these characteristics are not recorded in a standard way. For example, while all three P2P platforms document the purpose of the loan, these descriptions vary across platforms and so are not easily comparable.16 Our focus on geography therefore had the advantage that the data was already standardised in terms of postcode.17
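Because postcodes follow a common format, they can be mapped to regions in the same way for every platform. The Python sketch below shows one possible approach under stated assumptions: it extracts the postcode area (the leading one or two letters) and looks it up in a small table. The function names and the three-entry lookup table are hypothetical and deliberately incomplete; they are not the mapping used in this project.

```python
import re

# Leading one or two letters of a UK postcode, followed by a digit.
_AREA_PATTERN = re.compile(r"^([A-Z]{1,2})\d")

# Illustrative and incomplete lookup; a real analysis needs a full table.
AREA_TO_REGION = {
    "M": "North West",
    "SW": "London",
    "G": "Scotland",
}


def postcode_area(postcode):
    """Return the postcode area (e.g. 'SW' for 'SW1A 1AA'), or None."""
    cleaned = postcode.strip().upper().replace(" ", "")
    match = _AREA_PATTERN.match(cleaned)
    return match.group(1) if match else None


def postcode_region(postcode):
    """Return the region for a postcode, or None if it is not in the table."""
    return AREA_TO_REGION.get(postcode_area(postcode))
```

For example, postcode_region("m1 1ae") would resolve to the North West regardless of how a platform capitalises or spaces the postcode, whereas free-text fields such as loan purpose offer no comparable common key.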

As such this project has wider public purchase. In particular the Government has recently expressed a desire for financial firms to disclose postcode-level lending data to assess whether there are geographical imbalances in which regions receive funding from banks, and on what terms.18 We hope this project hastens the opening of such data. After all, if P2P platforms can make this data publicly available, why not better financially resourced banks?

More generally, this project is a contribution to a growing economic literature which exploits micro-level datasets.19 P2P platforms are particularly rich sites for conducting this kind of research because they are syndicated loan markets in extremis; that is, each loan is funded by multiple (often hundreds of) lenders.20 So although our data set includes ‘only’ a total of 59,851 individual loans, the loan parts or individual loan contracts number nearly 14 million (13,924,547 rows).21 Hence ours is ‘big’ data. A broader message of this project is that this sort of granular financial data is collectable in the age of terabyte warehouses, and analytically tractable with the visualisation tools we demonstrate here.



  1. FORREST CAPIE and MARK BILLINGS (2004). Evidence on competition in English commercial banking, 1920–1970. Financial History Review, 11, pp 69-103.
  2. BANK OF ENGLAND (2011). Instruments of macroprudential policy, December, p. 28
  3. DONALD KOHN (2011). Enhancing financial stability: the role of transparency. Speech at the London School of Economics, 6 September.
  4. BENEDIKT KOEHLER (2006) History of Financial Disasters, 1763-1995 Volume 2, p. 148.
  5. BANK OF ENGLAND (2011). Record of the interim Financial Policy Committee meeting, June.
  6. For an understanding of the differences between banking and P2P from a macroprudential point of view see IZABELLA KAMINSKA (2013) P2P as full-reserve banking. Financial Times, 14 May.
  7. See, for instance, the highly granular data available for download from US P2P platform Lending Club: www.lendingclub.com/info/download-data.action
  8. Comparison of the three platforms’ combined loans to date against the whole of the UK P2P market (as at 19/06/2013). See www.p2pmoney.co.uk/companies.htm for details. The total amount in the sample, £378 million, covers a shorter period from October 2010 and excludes areas like the Isle of Man.
  9. Ibid.
  10. JONATHAN MOULES (2013). Peer lending beats equity returns. Financial Times. It is also difficult for many ordinary individuals to have the opportunity to directly invest in small, early-stage enterprises via the equity market. This is because investment banks often distribute shares in newly listed companies to their existing network of high net worth individuals or fund manager clients.
  11. DEPARTMENT FOR BUSINESS, INNOVATION AND SKILLS (2012). Boosting Finance Options for Business, March.
  12. ANDY COSH, ALAN HUGHES, ANNA BULLOCK, and ISOBEL MILNER (2009) SME finance and innovation in the current economic crisis. Centre for Business Research, University of Cambridge; NEIL LEE, HIBA SAMEEN and LLOYD MARTIN (2013). Credit and the crisis: Access to finance for innovative small firms since the recession. Big Innovation Centre, June.
  13. YANNIS PIERRAKIS and LIAM COLLINS (2013). Banking on Each Other: Peer-to-Peer Lending to Business: Evidence from Funding Circle, p. 8.
  14. ANDY DAVIS (2012). Seeds of Change: Emerging Sources of non-bank funding for Britain’s SMEs. Centre for the Study of Financial Innovation.
  15. A good primer is DAVID SMITH (1989) North and South: Britain’s Economic, Social and Political Divide. Penguin.
  16. This points to the need for the financial industry to develop a common data classification system. DAVID BHOLAT (2013) The future of central bank data. Journal of Banking Regulation
  17. We did not focus on defaults and rates of recoveries because the industry is still in its infancy. Losses may still materialise and so no reliable inferences about the riskiness of the industry can be made at this point in time.
  18. Community Development Finance Association (2012) Government commits to disclosure of ‘postcode level’ lending data, November 15.
  19. For example, GABRIEL JIMENEZ, STEVEN ONGENA, JOSE LUIS PEYDRO, and JESUS SAURINA (2008). Hazardous times for monetary policy: What do twenty-three million bank loans say about the effects of monetary policy on credit risk-taking? Documentos de Trabajo No. 833, Banco de Espana
  20. Note that the current recipient of cash flows may be different from the original lender because certain P2P platforms allow the original lender to transfer rights to the remaining repayments to other investors. The focus here is on the original lenders’ postcodes rather than geographical information on where and to whom those cash flows now go.
  21. The data underpinning our analysis starts from 1 October 2010, the earliest date at which all three lenders were operating.