The Institute for Systems Biology RepeatMasker


    Welcome!

    RepeatMasker is a program that screens DNA sequences for interspersed repeats and low-complexity DNA sequences. The output of the program is a detailed annotation of the repeats that are present in the query sequence, as well as a modified version of the query sequence in which all the annotated repeats have been masked (default: replaced by Ns). Currently over 56% of human genomic sequence is identified and masked by the program. Sequence comparisons in RepeatMasker are performed by one of several popular search engines, including nhmmer, cross_match, ABBlast/WUBlast, RMBlast and Decypher. RepeatMasker makes use of curated libraries of repeats and currently supports Dfam (a profile HMM library derived from Repbase sequences) and Repbase, a service of the Genetic Information Research Institute.
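
    The masking step described above can be illustrated with a minimal Python sketch. This is not RepeatMasker's own code: the function name, the 0-based half-open coordinates, and the example intervals are all assumptions made for illustration. Lowercase ("soft") masking is shown only as a commonly used alternative to replacement by Ns.

```python
def mask_repeats(sequence, intervals, soft=False):
    """Return `sequence` with each (start, end) interval masked.

    Intervals are assumed to be 0-based and half-open. Hard masking
    (the default here) replaces bases with 'N'; soft masking
    lowercases them instead.
    """
    bases = list(sequence)
    for start, end in intervals:
        # Clamp to the sequence bounds so stray coordinates cannot fail.
        for i in range(max(start, 0), min(end, len(bases))):
            bases[i] = bases[i].lower() if soft else "N"
    return "".join(bases)


# Hypothetical query and annotation coordinates, for illustration only.
query = "ACGT" * 3 + "T" * 8 + "ACGT"
annotated = [(4, 8), (12, 20)]

hard = mask_repeats(query, annotated)
soft = mask_repeats(query, annotated, soft=True)
print(hard)
print(soft)
```

    The masked sequence keeps the same length as the query, so coordinates in the annotation remain valid against either version.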

    Latest News

    If you would like to keep up with news and announcements relating to RepeatMasker, you can follow us on Threads: @repeatmasker, or on Bluesky:

    RepeatMasker 4.1.6 Released
    Tuesday, Dec 5, 2023
    A new release of RepeatMasker is available. This version is compatible with the new partitioned FamDB format featured in Dfam 3.8. In this release a database was not included, however the configure script will automatically download a minimal FamDB partition if requested. Additional partitions (grouped by taxa) may be optionally downloaded and configured for use at any time. See the RepeatMasker page for more details.
    RepeatModeler 2.0.5 and RMBlast 2.14.1 Released
    Wednesday, Oct 4, 2023
    The RepeatModeler 2.0.5 update fixes a minor bug that caused failures in "absolute" reproducibility. Prior to this release, use of the "-srand" option would not guarantee that the outputs consensi.fa.classified and families-classified.stk were exactly the same in sequence and sequence order. It did guarantee that the same samples were drawn from the genome, and that equivalently scoring families were derived at each stage (although not necessarily the same families). In this release, secondary sorts were added to guarantee a fixed sort order among equally scoring results, generating exactly the same output files each time the same random number generator seed is used.
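
    The idea of a secondary sort can be sketched in Python as follows. This is only an illustration of tie-breaking, not RepeatModeler's implementation; the family names, scores, and the choice of name as the secondary key are hypothetical.

```python
import random


def rank_families(families, seed):
    """Sort families by score, breaking ties by name.

    The shuffle stands in for any upstream step whose output order
    may vary between runs.
    """
    rng = random.Random(seed)
    shuffled = families[:]
    rng.shuffle(shuffled)
    # Primary key: score, descending. Secondary key: name, ascending,
    # so equally scoring families always come out in the same order.
    return sorted(shuffled, key=lambda f: (-f["score"], f["name"]))


# Hypothetical families; two of them share a score.
families = [
    {"name": "family-3", "score": 250},
    {"name": "family-1", "score": 250},
    {"name": "family-2", "score": 400},
]

order_a = [f["name"] for f in rank_families(families, seed=1)]
order_b = [f["name"] for f in rank_families(families, seed=2)]
print(order_a)
print(order_b)
```

    Sorting on the score alone would leave the relative order of the two tied families dependent on their input order; the secondary key removes that dependence.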

    RMBlast has also been updated to expose the "db_soft_mask" option in support of future development of RepeatMasker. See the pages RMBlast and RepeatModeler for more details.

    RMBlast 2.14.0
    Monday, May 8, 2023
    RMBlast has been updated to the latest version of the NCBI BLAST+ tools (2.14.0), including binaries for 64-bit Mac and Linux. This version also fixes the inconsistent reporting of equivalent scoring alignments when masklevel and multi-threading are used. See the RMBlast page for more details.
    RepeatMasker 4.1.5
    Thursday, Mar 23, 2023
    A new release of RepeatMasker is available. This version is compatible with the latest release of Dfam (3.7) and ships with the curated portion of the database. In addition to several bugfixes we included a new utility to join result files from serial runs of RepeatMasker. See the RepeatMasker page for more details.
    RepeatModeler 2.0.4
    Tuesday, Dec 13, 2022
    A new release of RepeatModeler is available. The masking stages of RepeatModeler have been improved and parallelized to greatly reduce the runtime of the pipeline and minimize rediscovery of families in later stages. In addition, we take advantage of the new query-centric threading features of RMBlast 2.13.0 to further boost the runtime performance. As a result of these speedups we have increased the default sample size to trade some of these gains for added sensitivity. See the RepeatModeler page for more details.
    RepeatMasker 4.1.4
    Monday, Nov 7, 2022
    A new release of RepeatMasker is available. In 4.1.4 we added support for the new RMBlast release (2.13.0), added a new tool to generate TE trackhubs for visualization in the UCSC browser, added additional stats to the *.align file (CpG site count and unadjusted Kimura divergence), and fixed some minor bugs. See the RepeatMasker page for more details.
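
    For readers unfamiliar with the two statistics named above, the following Python sketch computes a CpG dinucleotide count and the standard Kimura two-parameter distance, d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the proportions of transitions and transversions among compared sites. It is an assumption that this textbook formula corresponds to the "unadjusted" value reported in the *.align file; the function names and example sequences are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def kimura_2p(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned sequences."""
    compared = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites in alignment")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def count_cpg(seq):
    """Count CpG dinucleotides (a C immediately followed by a G)."""
    return seq.upper().count("CG")


# One A<->G transition in ten aligned sites.
distance = kimura_2p("AAAAAAAAAA", "GAAAAAAAAA")
cpg_sites = count_cpg("ACGCGT")
print(distance, cpg_sites)
```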
    RMBlast 2.13.0
    Thursday, Nov 3, 2022
    RMBlast has been updated to the latest version of the NCBI BLAST+ tools (2.13.0), including binaries for 64-bit Mac and Linux. This version introduces query-based threading support. In addition, an experimental Mac M1 arm64 binary package is also provided. See the RMBlast page for more details.
    RepeatMasker 4.1.3-p1 Released
    Friday, Sep 30, 2022
    A new patch release of RepeatMasker is available for download. This fixes a problem in the original 4.1.3 in which data was missing that supports the joining of DNA transposon fragments. It also fixes a problem in ProcessRepeats that can cause it to crash with an error. See the RepeatMasker page for the full release notes and installation details.
    RepeatMasker 4.1.3 Released
    Monday, Aug 29, 2022
    A new version of RepeatMasker is available. This release adds a new utility to support the creation of trackhubs for a new UCSC genome browser TE visualization, updates GFF output to v3 and fixes several reported bugs in 4.1.2p1. See the RepeatMasker page for changelog and installation details.
    RepeatModeler 2.0.3 Released
    Tuesday, Feb 8, 2022
    A new version of RepeatModeler is available. This release fixes a bug affecting some sequence coordinates in the Stockholm output files. The program now generates an explicit log file with each run to assist with reproducibility and bug reporting.
    RepeatModeler 2.0.2 Released
    Monday, May 3, 2021
    A new version of RepeatModeler is available. This release includes a set of manual curation tools for use with de-novo generated TE libraries, in addition to miscellaneous bugfixes and improvements.
    RepeatMasker 4.1.2-p1 Released
    Thursday, April 1, 2021
    A new patch release of RepeatMasker is available for download. This release fixes a bug in 4.1.1/4.1.2 with the processing of Alu sequences in primates. In these prior releases Alu sequences were being correctly masked, however they were not being automatically compared to the larger Alu subfamily library and did not receive detailed subfamily annotation. See the RepeatMasker page for installation details.
    RepeatMasker 4.1.2 Released
    Friday, March 19, 2021
    A new release of RepeatMasker is available for download. This release fixes some minor issues with RepeatMasker and its auxiliary tools. More importantly, this release remedies a problem with its use by RepeatModeler that can cause poor classification performance in RepeatModeler's de novo libraries. See the RepeatMasker page for installation details.
    RMBlast 2.11.0
    Thursday, March 11, 2021
    RMBlast has been updated to the latest version of the NCBI BLAST+ tools (2.11.0), including binaries for 64-bit Mac and Linux. This version introduced opt-out usage reporting, which we have modified in our RMBlast distributions. See the RMBlast page for more details.
    RepeatMasker 4.1.1 Released
    Thursday, September 3, 2020
    A new release of RepeatMasker is available for download. In this version we have added support for Dfam 3.2 and for FamDB library files. FamDB is an HDF5-based format which stores family models (HMM and consensus sequences), family metadata, and a subset of the NCBI taxonomy database relevant to the families stored within. In addition, RepeatMasker includes a utility tool which supports a wide range of querying and exporting capabilities on the data stored in this format. See the RepeatMasker page for installation details.
    RMBlast 2.10.0
    Wednesday, January 8, 2020
    RMBlast has been updated to the latest version of the NCBI Blast+ tools (2.10.0), including binaries for 64-bit Mac and Linux. See the RMBlast page for installation details.
    RepeatModeler 2.0 Released
    Wednesday, November 27, 2019
    A new version of RepeatModeler is available with support for structure-based LTR discovery using LTRharvest and LTR_retriever. The new workflow, developed in collaboration with Jullien Flynn, Andrew Clark, and Cedric Feschotte, vastly improves the quality of the LTR families produced by RepeatModeler. In addition to bugfixes we improved the speed of the masking phase, refactored the configuration system to be more flexible for package managers, and generated both Docker and Singularity containers for simplified installation. A preliminary manuscript has been submitted to bioRxiv [856591].
    RepeatMasker 4.1.0 Released
    Wednesday, October 30, 2019
    A new release of RepeatMasker is available for download. In this version the configuration system has been refactored to make it easier to distribute RepeatMasker via package managers and/or bundle into Docker/Singularity containers. In addition, we have included a useful Python tool developed by David Ray's lab for manipulating/filtering RM annotation files (*.out) and saving output to the BED file format. See the RepeatMasker page for installation details.
    RMBlast 2.9.0-p1 bugfix
    Wednesday, August 7, 2019
    We have identified a bug in NCBI BLAST+ that can occasionally cause a crash or garbled alignments when running rmblastn. We have issued a new patch and reported our findings to NCBI so it can be fixed upstream. See the RMBlast page for installation details.
    RepeatMasker Masking Service Changes
    Monday, May 20, 2019
    As of May 20, 2019 GIRI has rescinded our working agreement allowing the website to offer a repeatmasking service utilizing the RepBase RepeatMasker Edition library. At this time we can only offer masking using the open database Dfam, which starting in 3.0 includes consensus sequences in addition to profile hidden Markov models for many transposable element families. Users requiring RepBase will need to purchase a commercial or academic license from GIRI and run RepeatMasker locally. We are working to expand the Dfam database and invite you to visit Dfam for more information.
    RepeatMasker 4.0.9
    Tuesday, April 9, 2019
    A new release of the RepeatMasker package is now available. RepeatMasker will now work with the new combined consensus/HMM Dfam database (Dfam 3.0) and/or user-provided custom libraries out-of-the-box. Dfam is an open database of transposable element (TE) profile HMM models and consensus sequences. The current release (Dfam 3.0) contains 6,235 TE families spanning five organisms (human, mouse, zebrafish, fruit fly, and nematode) and a growing number of new species. See the RepeatMasker page for installation details.
    RMBlast 2.9.0
    Friday, April 5, 2019
    RMBlast has been updated to the latest version of the NCBI Blast+ tools (2.9.0). This version is being released as both a patch to the NCBI Blast+ source and as compiled binaries for 64-bit Mac and Linux. Thanks again to NCBI for their help with these efforts. See the RMBlast page for installation details.
    Introducing Dfam 3.0
    Wednesday, March 6, 2019
    The Dfam consortium is excited to announce the release of Dfam 3.0. This release represents a major transition for Dfam from a proof-of-concept database into a funded open community resource. Central to this transition is a major infrastructure and technology update, enabling Dfam to handle the increasing pace of genome sequencing and TE library generation. Equally important, we merged Dfam_consensus with Dfam to produce a single resource for transposable element family modeling and annotation. In doing so, Dfam serves the needs of a broader research community while maintaining a high standard for family characterization (based upon seed alignments), and TE annotation sensitivity. Finally, and most importantly, we are working on making Dfam a community driven resource through the development of online curation tools and direct user engagement. [ read more ].
    To access the database head over to
    RepeatMasker 4.0.8 And Libraries Released
    Wednesday November 21, 2018
    A new RepeatMasker package, Repeat Protein Database, and RepBase RepeatMasker-edition have been released. The Repeat Protein Database grew by over 7400 entries and includes 16.1 million amino acids covering 133 subclasses of transposable elements. For more information on this library see the documentation that accompanies the library. In addition we have updated the RepeatMasker libraries for RepBase ( Repbase RepeatMasker-edition version 20180826, RepBase version 23.08 ). The update includes over 4500 new families from: rice (1652), the western painted turtle (472), the african clawed frog (215), wood tobacco (210), and the sweet potato whitefly (182) among others. The new RepeatMasker package may be downloaded from here. The new RepBase RepeatMasker-edition is available for download at:
    November Update
    Tuesday November 7, 2017
    We received great interest and feedback on the new Dfam_consensus database since our release in the Spring. As a result we have been working on improvements to the tools supporting the database and assisting with data submissions. In addition we have begun opening up the software development repositories for several of our projects to foster further collaboration. Here are a few notable updates:
    • Dfam_consensus - Today we released a new version of the database containing several new families for the African Golden Mole and a library for the Collared Flycatcher provided by Alexander Suh.
    • RepeatMasker, RepeatModeler, and Coseg software development repositories are now available on GitHub. Help requests may now be submitted through the GitHub site in addition to the website.
    • RepeatModeler - We have been working hard on eliminating several bugs, and improving the Dfam_consensus import tool based on feedback we have received. The latest version is 1.0.11 and may be downloaded from: GitHub or
    Introducing Dfam_consensus - Dfam's consensus sequence twin
    Thursday May 18, 2017
    Today we officially announced a new open database of Transposable Element consensus sequences called Dfam_consensus. From the Xfam blog:
    "Since its inception in 2012, Dfam has demonstrated the promise of using profile hidden Markov Models (HMMs) to improve the detection sensitivity and annotation quality of Transposable Element (TEs) families in human[1] and subsequently for four additional reference organisms[2]. Despite these advances, the tools used to discover new families ( de-novo repeat finders ), improve families ( extend, defragment, subfamily clustering ), and classify TE families continue to depend on consensus sequence models. This discordance between methodologies is a direct impediment to Dfam's expansion."
    See the full announcement at:
    RepeatModeler 1.0.9 Released
    Thursday April 6, 2017
    Today we released a new version of the RepeatModeler de-novo repeat identification and library building software suite. This version produces seed alignments in addition to the consensus sequence for each family discovered. The seed alignments provide a richer description of the family, allowing for either a consensus sequence or a Profile Hidden Markov Model of the family to be produced. In addition, this release includes two new utilities to support submission of seed alignments to the new Dfam_consensus database from the command-line. The RepeatModeler release is available here and at our new GitHub site.
    RMBlast 2.6.0 BUGFIX
    Wednesday, March 29, 2017
    We have found a bug in NCBI BLAST which also affects rmblastn. There is a race condition set up when searches are run in multi-threaded mode and with a "-gilist" parameter. At this time only RepeatModeler uses this combination. The problem is intermittent and causes rmblastn to segfault with the message: "Critical: (109.4) CObject::RemoveLastReference: CObject was referenced again". The patch has been updated for 2.6.0 and can be downloaded from the RMBlast page. Thanks to Stéphanie Vignola for reporting this problem to us.
    RMBlast 2.6.0
    Tuesday, February 7, 2017
    The RMBlast program has been updated to work with the latest NCBI Blast+ tools ( version 2.6.0 ). This update is being released as a patch to the NCBI source distribution with a binary release to follow. This project wouldn't be possible without the support of Christiam Camacho and Tom Madden at the NCBI. Please see our RMBlast page for details on how to install the program.
    RepeatMasker 4.0.7 And Libraries Released
    Wednesday, February 1, 2017
    A new version of RepeatMasker is available for download. This version enables RepeatMasker to take advantage of more than one consensus library at one time and coincides with the pre-release of our new database - Dfam_consensus. Dfam_consensus is a freely available open database of consensus sequences and will be developed in parallel with the Dfam database to provide multiple ways of modeling Transposable Element families. In addition to the RepeatMasker changes we have updated the RepBase RepeatMasker Edition library, which is available from the GIRI website. We expect there to be a few more RepeatMasker/Library updates as we move the new databases forward. The new release can be downloaded here.
    New RepeatMasker Libraries Released
    Monday, September 12, 2016
    A new RepeatMasker Library has been released. The numbers of rice and Atlantic salmon repeats have roughly doubled to 1500 and 1200 elements, respectively, while an analysis of the zebrafish genome now involves about 2500 consensus sequences. The migratory locust database has grown from 103 to 1128 elements. Among new species or species with previously just a handful of elements are the salmon louse Lepeophtheirus salmonis (660 elements), Arabidopsis thaliana's sibling Arabidopsis lyrata (431), the basal flowering plant Amborella (390), the mulberry Morus notabilis (312), the northern pike Esox lucius (311), the kissing bug Rhodnius prolixus (247), the butterfly Papilio polytes (221), the peanut-related herb Arachis ipaensis (211), the polychaete worm Capitella teleta (199), the leech Helobdella robusta (189), the castor-oil plant Ricinus communis (183), the physic nut Jatropha curcas (182), the Indian lotus Nelumbo nucifera (179), the cotton plant Gossypium raimondii (172) and the cichlids (161 elements). The updated RepeatMasker library ( 20160829 ) is available for download at:
    New Coseg Released
    Thursday, November 5, 2015
    In this new release of coseg we have improved the graph visualization and added direct SVG export. This eliminates the dependency on GraphViz for generating graph images. We have added an experimental script to pass the subfamily members through the RepeatModeler consensus refinement process to improve the quality of the output. Lastly, we have fixed a segfault bug and improved the error checking of input files. The new version may be downloaded here.
    RepeatMasker 4.0.6 and New RepeatMasker Libraries Released
    Wednesday, November 4, 2015
    RepeatMasker 4.0.6 is now available for download. This release supports the multi-species expansion to the Dfam database ( Dfam 2.0 ) as well as an update to the RepeatMasker consensus libraries. The new RepeatMasker release is available here and the updated RepeatMasker library ( 20150807 ) is available for download at:
    Introducing Dfam 2.0
    Wednesday, November 4, 2015
    Dfam is growing up. This is the first major expansion of the database since its inception, adding repeat families from four new organisms: mouse, zebrafish, fruit fly, and nematode. In total, this release includes 2,844 new families ( 4,150 total ). To access the database head over to [read more ].
    Say hello to Dfam 1.4
    Wednesday, May 13, 2015
    With Dfam, we are striving to build models of repeat families that yield high sensitivity without undue false annotation. In this release of Dfam, we have improved our model building strategy to reduce the potential for false annotation, especially in the context of overextending alignments around true interspersed repeat instances...[ read more ]. The new RepeatMasker-ready database can be downloaded from here:
    Dfam 1.3 Released
    Thursday, January 15, 2015
    Dfam, a database of Repetitive DNA element profile hidden Markov models ( HMMs ) was recently updated. The new release synchronizes the Dfam human repeat collection with RepeatMasker's 20140131 library. The new Dfam can be used with the current release of RepeatMasker ( 4.0.5 ). A full description of the update can be found here:
    Scheduled Downtime
    Thursday, January 15, 2015
    The RepeatMasker servers will be offline Friday, January 16th for most of the morning for software maintenance. We expect the servers will be back online by the afternoon.
    Queuing System Repaired
    Monday, September 29, 2014
    Over the weekend the RepeatMasker job queuing system experienced some problems and stopped processing job requests. This has been resolved and jobs are now being processed normally.
    New RepeatModeler and RECON Released
    Thursday, May 29, 2014
    We released a new version of the RepeatModeler de-novo repeat identification and library building software suite. This version supports parallel BLAST searches and greatly speeds up the analysis on multiprocessor systems. The new version also adds the ability to restart a crashed RepeatModeler run where it left off. We are also releasing a new version of RECON which fixes a buffer overrun bug ( reported by Stephen Ficklin ). The RepeatModeler release is available here, and our RECON release can be downloaded from here: RECON-1.0.8.tar.gz.
    Updated Pre-Masked Genomes And Landscapes
    Thursday, May 29, 2014
    We have expanded the "Genome Analysis and Downloads" page at the RepeatMasker website, adding an additional 30 species. RepeatMasker 4.0.5 ( db20140131 ) has been run on all of these species ( 67 existing and new ) and the repeat landscape graphs have been updated. In addition to displaying the standard repeat landscape we now provide additional summary statistics of the RepeatMasker run and a pie chart of the repetitive fraction of the genome. The page is listed as Genome Analysis and Downloads under the Service menu at the top left of the main site.
    RepeatMasker 4.0.5 and New RepeatMasker Libraries Released
    Wednesday, February 5, 2014
    RepeatMasker 4.0.5 is now available for download. Enhancements include: Dupmasker support for RMBlast, the Kimura divergence is now calculated for each alignment and placed in the *.align files, and we now make available our software for drawing repeat landscapes (in the util/ directory). The new release is available here:

    An updated set of RepeatMasker libraries ( 20140131 ) is also available for download from GIRI. Additions include improvements and expansion of eutherian to mammalian-wide ancestral repeats, addition of a very detailed set of mouse-specific LINE subfamilies from the Boissinot lab, and new or much expanded libraries created by GIRI for an oddball selection of species, including strawberry, oyster, anole lizard, painted turtle, sea lamprey, acorn worm, and a few fungi.

    RepeatMasker 4.0.3 Maintenance Update
    Thursday, June 20, 2013
    Today we released RepeatMasker 4.0.3. This is a maintenance update which fixes several minor issues in the 4.x releases including: a problem running RM on species names which contain parentheses in the NCBI taxonomy database, missing ID values in rare circumstances, and a problem with Alu refinement when provided with very long sequence names. The new release is available here:

    NOTE: Dfam users will want to update their HMMER distribution to the recently released v3.1b1 available at:

    RepeatMasker 4.0.2 Maintenance Update And New Library Release
    Monday, April 29, 2013
    Today we released RepeatMasker 4.0.2. This is a maintenance update which fixes several problems in 4.0.0/4.0.1. Notably there were issues with human Alu refinement, short input sequences producing "FastaDB::substr - Error index out of bounds!" errors, and lastly an issue with overlapping annotations not being merged. We have also released a new RepeatMasker library ( rm-20130422 ) which includes updates from Repbase as well as four new genome libraries: Gibbon (Nomascus leucogenys), American alligator (Alligator mississippiensis), saltwater crocodile (Crocodylus porosus), and gharial (Gavialis gangeticus). The new release is available here:
    RepeatMasker 4.0.1 Maintenance Update
    Friday, February 22, 2013
    Today we released RepeatMasker 4.0.1. This is a maintenance update which fixes problems observed by some of our users. Notably this fixes error messages produced by the configure script, problems using the older wublast program with RepeatMasker, empty classname columns when custom libraries are used, and noisy perl warnings. Also included in this release are an updated taxonomy database, and an expanded repeat protein database. The new release is available here:
    RepeatModeler 1.0.7 - Update
    Tuesday, January 15, 2013
    Today we released RepeatModeler 1.0.7. This version adds support for the newly released RepeatMasker 4.0 package and the RMBlast 2.2.27+ search engine. The release is available here:
    RepeatMasker 4.0
    Thursday, January 10, 2013
    Today we released RepeatMasker 4.0 adding support for the new nhmmer program and the new profile HMM database of transposable elements - Dfam. Other changes include: a new alignment file format for improved cross referencing of database/annotation identifiers, adoption of TRF for simple repeat identification, improved SINE subfamily refinement, and plenty of bugfixes. The new release is available here:
    NCBI Releases BLAST+/RMBlast 2.2.27
    Friday, September 14, 2012
    In collaboration with NCBI we now have a synchronized release of the RMBlast and NCBI BLAST+ tools. NCBI now hosts the source code and pre-compiled binaries for RMBlast allowing us to support a more diverse set of hardware/software platforms. Please see our RMBlast page for details on how to install the new release with RepeatMasker and RepeatModeler. Special thanks to George Coulouris at NCBI for all his assistance in getting this distribution system setup.
    Dfam: A Database for Profile HMMs of Transposable Elements
    Thursday, September 13, 2012
    The first version of a transposable element profile HMM database was released this month. This represents a major improvement in the characterization of these interesting sequences. Profile methods are known to improve sensitivity over single sequence search, with profile HMMs in particular leveraging the additional information content in position-specific residue and indel variability. Until very recently the use of DNA/DNA profile HMMs to conduct large scale genomic searches was impractical. Advances by the HMMER3 development team at HHMI Janelia Farm have made genome scale searches of profile HMMs feasible and enabled the development of this new community resource. A new version of RepeatMasker which uses Dfam and nhmmer will be released in the next few weeks. This work is a collaboration between HHMI Janelia Farm, GIRI ( Genetic Information Research Institute, Repbase ), and the Institute for Systems Biology.

    The official announcement of the resource:

    The database website:

    - RepeatMasker uses the Dfam database of repeat profile hidden Markov models and consensus sequences to conduct searches.
    - RepeatMasker can also make use of Repbase which is a service of the Genetic Information Research Institute. Repbase is a database of repetitive element consensus sequences.
    - Data and computational resources for the Pre-Masked Genomes page are provided courtesy of the UCSC Genome Bioinformatics group.

    Institute for Systems Biology
    This server is made possible by funding from the National Human Genome Research Institute (NHGRI grant # R01 HG002939).