spa: Semi-Supervised Semi-Parametric Graph-Based Estimation in R

Author

Listed:
  • Culp, Mark

Abstract

In this paper, we present an R package that combines feature-based (X) data and graph-based (G) data for prediction of the response Y. In this particular case, Y is observed for a subset of the observations (labeled) and missing for the remainder (unlabeled). We examine an approach for fitting Y = Xβ + f(G) where β is a coefficient vector and f is a function over the vertices of the graph. The procedure is semi-supervised in nature (trained on the labeled and unlabeled sets), requiring iterative algorithms for fitting this estimate. The package provides several key functions for fitting and evaluating an estimator of this type. The package is illustrated on a text analysis data set, where the observations are text documents (papers), the response is the category of paper (either applied or theoretical statistics), the X information is the name of the journal in which the paper resides, and the graph is a co-citation network, with each vertex an observation and each edge the number of times that the two papers cite a common paper. An application involving classification of protein location using a protein interaction graph and an application involving classification on a manifold with part of the feature data converted to a graph are also presented.
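
As a rough illustration of the model form Y = Xβ + f(G), the sketch below alternates between a least-squares step for β on the labeled rows and a graph-smoothing step for f over all vertices. It is not the spa package's algorithm or API: the function name fit_semi_parametric, the squared-error loss, the Laplacian penalty, the parameter lam, and the toy chain graph are all assumptions made for this example, and it is written in Python rather than R.

```python
import numpy as np

def fit_semi_parametric(X, W, y, labeled, lam=1.0, n_iter=50):
    """Backfitting sketch (hypothetical helper) for y ~ X @ beta + f,
    where f is penalized by lam * f' L f, L being the graph Laplacian of W.
    Assumes every connected component of the graph has a labeled vertex."""
    n = X.shape[0]
    lap = np.diag(W.sum(axis=1)) - W        # combinatorial graph Laplacian
    J = np.diag(labeled.astype(float))      # diagonal selector of labeled vertices
    A = J + lam * lap                       # system matrix of the graph smoother
    y0 = np.where(labeled, y, 0.0)          # unlabeled responses are never used
    beta = np.zeros(X.shape[1])
    f = np.zeros(n)
    history = []                            # penalized objective after each cycle
    for _ in range(n_iter):
        # Parametric step: least squares on labeled rows, holding f fixed.
        beta, *_ = np.linalg.lstsq(X[labeled], y0[labeled] - f[labeled], rcond=None)
        # Graph step: smooth the labeled residuals over all vertices.
        resid = y0 - X @ beta
        f = np.linalg.solve(A, J @ resid)
        err = (y0 - X @ beta - f)[labeled]
        history.append(err @ err + lam * (f @ lap @ f))
    return beta, f, history

# Toy data (assumed for illustration): a 6-vertex chain graph, 4 labeled vertices.
rng = np.random.default_rng(0)
n = 6
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
X = rng.normal(size=(n, 1))
labeled = np.array([True, True, False, True, False, True])
y = 2.0 * X[:, 0] + np.linspace(0.0, 1.0, n)
y[~labeled] = np.nan                        # responses missing for unlabeled vertices

beta, f, history = fit_semi_parametric(X, W, y, labeled, lam=0.5)
pred = X @ beta + f                         # predictions for all vertices
```

Each cycle exactly minimizes the same penalized objective over one block (β, then f), so the recorded objective does not increase from one cycle to the next. Predictions for the unlabeled vertices come from the graph term f being propagated along edges, which is where the unlabeled observations enter the fit.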

Suggested Citation

  • Culp, Mark, 2011. "spa: Semi-Supervised Semi-Parametric Graph-Based Estimation in R," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 40(i10).
  • Handle: RePEc:jss:jstsof:v:040:i10
    DOI: http://hdl.handle.net/10.18637/jss.v040.i10

    Download full text from publisher

    File URL: https://www.jstatsoft.org/index.php/jss/article/view/v040i10/v40i10.pdf
    Download Restriction: no

    File URL: https://www.jstatsoft.org/index.php/jss/article/downloadSuppFile/v040i10/spa_2.0.tar.gz
    Download Restriction: no

    File URL: https://www.jstatsoft.org/index.php/jss/article/downloadSuppFile/v040i10/v40i10.R
    Download Restriction: no

    Citations

    Cited by:

    1. repec:jss:jstsof:47:i03 is not listed on IDEAS
    2. repec:cte:wsrepe:ws1450804 is not listed on IDEAS
    3. Álvarez, Adolfo & Peña, Daniel, 2014. "Recombining partitions from multivariate data: a clustering method on Bayes factors," DES - Working Papers. Statistics and Econometrics. WS ws140804, Universidad Carlos III de Madrid. Departamento de Estadística.
